GULILA ALTENBEK
1. College of Information Science and Engineering, Xinjiang University
2. The Base of Kazakh and Kirghiz Language of National Language Resource Monitoring and
Research Center Minority Languages
3. Xinjiang Laboratory of Multi-language Information Technology
Urumqi, Xinjiang, 830046, P.R. China
IDENTIFICATION OF THE KAZAKH BASIC PHRASES BASED ON THE MAXIMUM
ENTROPY MODEL
Abstract: This paper proposes a definition, classification, and structure for Kazakh basic
phrases and establishes a framework for classifying them according to their syntactic
functions. The structure of Kazakh basic phrases is analyzed, followed by the rule-based
determination of basic phrase collocations and the rule-based extraction of basic phrases.
A Maximum Entropy (ME) model is then used to identify the phrases in text. Automatic
identification of Kazakh phrases based on the rule system with additional manual correction
achieved an accuracy of 81.58%. The features of the ME model are designed jointly from
templates over Kazakh words, parts of speech, and affixes. Experimental results show that
the accuracy reached 91.62%.
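
The abstract does not list the actual feature templates, so the following is only a minimal sketch of how templates over words, parts of speech, and affixes could feed a maximum entropy classifier. The template set, the tag names (PRON, NOUN), the BIO-style labels (B-NP, I-NP, O), the toy two-word input, and the hand-set weight are all illustrative assumptions, not the paper's design.

```python
import math

PAD = ("<PAD>", "<PAD>", "<PAD>")  # assumed padding token for sentence edges


def extract_features(tokens, i):
    """Build feature strings for position i.

    tokens is a list of (word, pos, affix) triples. The templates below are
    hypothetical examples of word / part-of-speech / affix templates.
    """
    prev = tokens[i - 1] if i > 0 else PAD
    cur = tokens[i]
    nxt = tokens[i + 1] if i + 1 < len(tokens) else PAD
    return [
        f"w0={cur[0]}",                  # current word
        f"p0={cur[1]}",                  # current part of speech
        f"a0={cur[2]}",                  # current affix
        f"p-1={prev[1]}",                # previous part of speech
        f"p+1={nxt[1]}",                 # next part of speech
        f"p-1|p0={prev[1]}|{cur[1]}",    # joint POS bigram template
        f"a0|p+1={cur[2]}|{nxt[1]}",     # joint affix / next-POS template
    ]


def maxent_probs(features, weights, labels):
    """Maximum entropy conditional distribution p(y | x).

    p(y | x) = exp(sum of weights of active (feature, y) pairs) / Z(x)
    """
    scores = {y: sum(weights.get((f, y), 0.0) for f in features) for y in labels}
    m = max(scores.values())             # subtract the max for numerical stability
    exps = {y: math.exp(s - m) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


# Toy input (illustrative only): a pronoun with a genitive affix followed by
# a noun with a possessive affix.
tokens = [("менің", "PRON", "ің"), ("кітабым", "NOUN", "ым")]
labels = ["B-NP", "I-NP", "O"]
weights = {("p-1|p0=PRON|NOUN", "I-NP"): 2.0}  # hand-set weight, not trained

feats = extract_features(tokens, 1)
probs = maxent_probs(feats, weights, labels)
best = max(probs, key=probs.get)
```

In a real system the weights would be estimated from an annotated corpus (for example with iterative scaling or gradient-based optimization) rather than set by hand; the sketch only shows how the joint templates turn into features and how the model scores each label.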
Key words: Kazakh basic phrase; phrase identification; maximum entropy; rules.