А corpus-based frequency statistic…
Series "Philology". No. 2(86)/2017
based on the use of rules and 86 kinds of grammatical markers for automatic POS tagging. The modern
British English LOB corpus of the 1970s also collected 1 million words and used 133 kinds of grammatical
markers; its POS tagging was done with CLAWS (Constituent Likelihood Automatic Word-tagging System),
which achieves automatic part-of-speech tagging from statistical information. In recent years, many
researchers have constructed their own language corpora, such as the Korean National Corpus, the Turkish
National Corpus [4], the Russian National Corpus, and the Chinese Peking University Corpus.
Corpus-based morphological analysis not only draws on large-scale real language data but also supports
qualitative explanation of specific language phenomena. Corpus analysis provides a new research platform
grounded in language use: language can be analyzed through features such as word frequency and syntax.
Dictionaries can be compiled on the basis of a corpus, and collocations of specific words can be searched;
corpus-based vocabulary development can survey the grammatical and usage features of words; and
corpus-based techniques can provide language learners with examples for language analysis.
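As an illustration of the word-frequency and collocation searches mentioned above, the following is a minimal Python sketch, not the procedure used in this study. The sample text, the simple regular-expression tokenizer, and the function names `tokenize`, `word_frequencies` and `collocates` are all assumptions made for the example; a real corpus would need language-specific tokenization and morphological analysis.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens.

    This is a deliberately simple rule, assumed for illustration only.
    """
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(tokens):
    """Return a Counter mapping each word to its number of occurrences."""
    return Counter(tokens)


def collocates(tokens, node, window=2):
    """Count words found within `window` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Toy text standing in for a corpus (hypothetical data).
sample = "The corpus is large. The corpus is tagged. A tagged corpus helps learners."
tokens = tokenize(sample)
freq = word_frequencies(tokens)
near_corpus = collocates(tokens, "corpus", window=1)

print(freq.most_common(3))
print(near_corpus.most_common())
```

Raw co-occurrence counts like these favor frequent function words, so in practice they are usually combined with an association measure before being read as collocations.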