4.3 Kazakh adjective phrase structure
An adjective is a word that describes a noun or pronoun. Kazakh adjective phrases are classified
by their function into the following patterns (see attachment 2):
1) adj + n; 2) adj + v; 3) adj + n + v; 4) pron + adj; 5) adv + adj + n; 6) adj + adj + n; 7) num + adv + n.
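The seven patterns can be read as sequences of POS tags. A minimal Python sketch of that reading follows. The tag spellings are taken from the list above, while the set `ADJ_PHRASE_PATTERNS` and the helper `is_adjective_phrase` are hypothetical names introduced here for illustration and are not part of the described system:

```python
# The seven adjective-phrase patterns listed above, written as POS-tag tuples.
# Storing them in a set is an assumption made for this sketch.
ADJ_PHRASE_PATTERNS = {
    ("adj", "n"),
    ("adj", "v"),
    ("adj", "n", "v"),
    ("pron", "adj"),
    ("adv", "adj", "n"),
    ("adj", "adj", "n"),
    ("num", "adv", "n"),
}


def is_adjective_phrase(tags):
    """Return True if the POS-tag sequence equals one of the seven patterns."""
    return tuple(tags) in ADJ_PHRASE_PATTERNS
```

For example, `is_adjective_phrase(["adj", "n"])` holds, whereas the reversed order `["n", "adj"]` is not among the listed patterns.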
5 Rule-based verb phrase recognition algorithm
Kazakh has two characteristics that must be taken into account: agglutinative morphology
and relatively free word order with explicit case marking.
Input: a corpus that has been word-segmented (stems and affixes extracted) and POS-tagged (test.xml).
Output: first, a phrase-tagged file (result.xml); second, a phrase file (resultP.txt).
The rule-based phrase recognition algorithm is as follows:
(1) i = 1;
(2) while (not at the end of test.xml)
① Match the rules in the rule base from right to left;
② If a rule matches, insert the phrase boundary and the phrase POS tag;
③ i = i + 1 (move right);
(3) Output the recognized phrases and the phrase file.
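The scanning loop above can be sketched in Python as follows. This is one possible reading, not the implementation used here: it assumes that the rule base is a dictionary from POS-tag tuples to phrase labels, that "right to left" means trying the widest window first and shrinking it from the right, and that the scan jumps past a recognized phrase so that phrases do not overlap. The function name `recognize_phrases` and the label "AP" are hypothetical.

```python
def recognize_phrases(tags, rules):
    """Scan a POS-tag sequence and return (start, end, label) spans.

    tags  -- list of POS tags for one sentence
    rules -- dict mapping a tuple of POS tags to a phrase label (assumed format)
    """
    max_len = max((len(pattern) for pattern in rules), default=0)
    phrases = []
    i = 0
    while i < len(tags):
        matched = False
        # Try the widest window first and shrink it from the right.
        for end in range(min(len(tags), i + max_len), i, -1):
            label = rules.get(tuple(tags[i:end]))
            if label is not None:
                phrases.append((i, end, label))
                i = end  # assumption: jump past the phrase to avoid overlaps
                matched = True
                break
        if not matched:
            i += 1  # no rule applies here; move one token to the right
    return phrases


# Hypothetical rule base with two patterns that share the label "AP".
rules = {("adj", "n"): "AP", ("adj", "adj", "n"): "AP"}
print(recognize_phrases(["adj", "adj", "n", "v"], rules))  # [(0, 3, 'AP')]
```

Trying the widest window first means that the longer pattern adj + adj + n is preferred over the shorter adj + n whenever both could apply at the same position.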
Based on the basic phrase rules, we extracted phrases from the POS-tagged Kazakh
corpus. The extraction process is as follows:
(a) First, roughly segment the XML corpus. The common segmentation marks are the semicolon,
comma, full stop, exclamation mark, and question mark.
(b) For the segmented data, extract the three elements of a basic phrase: the part of speech (POS),
the affix, and the word.
(c) Look for a matching rule in the rule set. If one is found, save the basic phrase; otherwise, go back
to step (a).
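Steps (a) to (c) can be illustrated with a short Python sketch. Because the XML layout of the corpus is not specified here, the sketch assumes that the corpus has already been read into a list of (POS, affix, word) triples, with each punctuation mark stored in the word field. It also simplifies step (c) by comparing the POS sequence of a whole segment with the rule set. The names `SEGMENT_MARKS`, `split_segments`, and `extract_basic_phrases` are hypothetical.

```python
# Segmentation marks named in step (a).
SEGMENT_MARKS = {";", ",", ".", "!", "?"}


def split_segments(tokens):
    """Step (a): cut a token list into segments at the punctuation marks.

    Each token is a (pos, affix, word) triple (assumed input format).
    """
    segments, current = [], []
    for token in tokens:
        if token[2] in SEGMENT_MARKS:
            if current:
                segments.append(current)
            current = []
        else:
            current.append(token)
    if current:
        segments.append(current)
    return segments


def extract_basic_phrases(tokens, rules):
    """Steps (b) and (c): save the segments whose POS sequence is a rule."""
    saved = []
    for segment in split_segments(tokens):
        # Step (b): pull the POS element out of each triple.
        pos_seq = tuple(pos for pos, _affix, _word in segment)
        # Step (c): keep the words only if the rule set contains the sequence.
        if pos_seq in rules:
            saved.append([word for _pos, _affix, word in segment])
    return saved
```

A segment whose POS sequence is not in the rule set is simply skipped, which corresponds to returning to step (a) for the next segment.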