G. Altenbek, X.L. Wang
Вестник Карагандинского университета
Finally, for a further comparison between our letter statistics for Kazakh in the Arabic script and the letter frequencies of Kazakh in the Cyrillic script, we refer to the work of the Kazakhstani scholar Makhambetov (Makhambetov et al., 2013). The Cyrillic Kazakh letter frequency table is shown in Figure 3. It shows that the four letters «а, е, ы, н», which rank highest in our statistics, also occupy the top four positions in the Kazakh of Kazakhstan. Likewise, the letters «ə, һ» rank low in both. These results indicate that the frequencies of the letters for native Kazakh sounds are broadly consistent across Kazakh-speaking communities worldwide.
Figure 3. Kazakh Cyrillic letter statistics (Kazakhstan)
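A comparison of this kind can be reproduced by ranking letters by relative frequency in each corpus and checking how far the top positions coincide. The following is a minimal sketch, not the procedure used in the paper: the function names `letter_ranking` and `top_letters` and the toy text are hypothetical, and it assumes that each letter is a single Unicode code point.

```python
from collections import Counter


def letter_ranking(text):
    """Return (letter, relative frequency) pairs, most frequent first."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    # Sort by descending count; break ties by code point for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(ch, n / total) for ch, n in ranked]


def top_letters(ranking, k=4):
    """Return the set of the k highest-ranked letters."""
    return {ch for ch, _ in ranking[:k]}


# Toy text for illustration only; not taken from either corpus.
ranking = letter_ranking("бала мен ана")
print(ranking[:2])
print(top_letters(ranking, k=2))
```

Applying `top_letters` to the rankings of two corpora and intersecting the resulting sets gives a quick check of whether the same letters lead in both.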
Kazakh word frequencies follow Zipf's law, a power law. Zipf's law is named after the American linguist George Kingsley Zipf. It states that, given a corpus of natural-language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. The formula for Zipf's law is as follows:
$$f_r = \frac{c}{r}.$$
Or
$$p_r = \frac{c}{r^{\gamma}},$$
where f is the frequency, r is the rank, c is a constant, and γ is a parameter.
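Whether a word list follows this power law can be checked by fitting a straight line to log frequency against log rank, since the power-law form implies log f = log c − γ log r. The sketch below is one common way to estimate c and γ, not necessarily the method used in this study. The function names `rank_frequency` and `fit_zipf` and the toy token list are hypothetical.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, frequency) pairs; rank 1 is the most frequent word."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(freqs, start=1))


def fit_zipf(pairs):
    """Least-squares fit of log f = log c - gamma * log r; returns (c, gamma).

    Needs at least two distinct ranks, otherwise the slope is undefined.
    """
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_x), -slope


# Toy tokens (placeholders, not corpus data) with frequencies 12, 6, 4, 3,
# i.e. exactly 12 / r, so the fit should give c close to 12 and gamma close to 1.
tokens = ["w1"] * 12 + ["w2"] * 6 + ["w3"] * 4 + ["w4"] * 3
pairs = rank_frequency(tokens)
c, gamma = fit_zipf(pairs)
print(pairs)
print(round(c, 3), round(gamma, 3))
```

On real corpora the lowest ranks and the long tail of rare words often deviate from the fitted line, so the estimate of γ depends on which range of ranks is included.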
We use the data from the first experiment to analyze statistically the relationship between the frequency and the length of Kazakh words. The length of a word is measured as the number of letters it contains.
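The length statistics can be sketched as a tally of tokens by number of letters. This is an illustrative sketch only: the function names `length_counts` and `length_distribution` and the toy tokens are hypothetical, and `len` counts Unicode code points, which matches the number of letters only when each letter is a single code point (combining marks would need extra handling).

```python
from collections import Counter


def length_counts(tokens):
    """Map word length (in letters) to the number of tokens of that length."""
    return Counter(len(word) for word in tokens)


def length_distribution(tokens):
    """Map word length to its share of all tokens, ordered by length."""
    counts = length_counts(tokens)
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}


# Toy tokens for illustration only; not the experimental data.
tokens = ["мен", "бала", "бала", "кітап"]
print(length_distribution(tokens))
```

Passing the list of distinct words instead of all tokens gives the distribution over the vocabulary rather than over running text.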
Table 3