thesauruses, morphological analyzer etc.
schemes, algorithms of systematization etc.
1. A model of an information system supporting narrow search mechanisms is
developed;
2. A trilingual thesaurus for the ‘Information Technology’ subject area is created,
containing terms in Kazakh, Russian and English (21,672 terms in total);
3. A new morphological analysis algorithm that reduces Kazakh words to their
normal form is proposed and implemented, taking into account the complex
morphology and orthography of the Kazakh language;
4. An algorithm for coordinate indexing of text is proposed and implemented;
it can be applied to the clustering and thematic classification of
documents;
5. Using the thesaurus, an algorithm for thematic classification of
documents is proposed and implemented; it measures the degree of proximity between a
heading and publications on a similar theme.
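
To illustrate result 5, the sketch below shows one possible way to compute a proximity score between a heading and a publication through a multilingual thesaurus. It is not the algorithm of this work, whose details are not given here. The thesaurus entries, the concept identifiers, the function names and the choice of cosine similarity are all assumptions made for illustration, and the words are assumed to be already reduced to their normal form.

```python
import math
import re
from collections import Counter

# Hypothetical miniature thesaurus: a term in any of the three languages
# maps to a language-independent concept identifier (illustrative entries only).
THESAURUS = {
    "algorithm": "ALGORITHM", "алгоритм": "ALGORITHM",
    "network": "NETWORK", "сеть": "NETWORK", "желі": "NETWORK",
    "program": "PROGRAM", "программа": "PROGRAM", "бағдарлама": "PROGRAM",
}


def concept_vector(text):
    """Count thesaurus concepts in a text (words assumed to be in normal form)."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(THESAURUS[t] for t in tokens if t in THESAURUS)


def proximity(heading, document):
    """Cosine similarity between the concept vectors of a heading and a document."""
    a = concept_vector(heading)
    b = concept_vector(document)
    dot = sum(a[c] * b[c] for c in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # An English heading compared with a Kazakh and a Russian text.
    print(proximity("network program", "желі бағдарлама"))
    print(proximity("network", "алгоритм"))
```

Because terms in all three languages map to the same concept identifiers, a heading in one language can be compared with a publication in another. A publication that shares no concepts with the heading receives a score of zero.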