(in fact, statistical machine translation is currently offered by Google for two Turkic languages,
Azeri and Turkish). However, in the case of Kazakh, it would be very hard to put together the
necessary amount of sentence-aligned parallel text, and rule-based machine translation, in which
experts write up dictionaries and grammatical rules that are applied by an engine, emerges as a clear
solution; in fact, existing commercial systems for English to Kazakh (Sanasoft⁷, Trident⁸) all appear
to be rule-based.
We are currently engaged in building a free/open-source rule-based machine translation system
from English to Kazakh, and we are using the Apertium free/open-source machine translation
platform (Forcada et al. 2011, http://www.apertium.org) for various reasons. On the one hand, the
platform already contains free/open-source English morphological dictionaries and, more
importantly, Kazakh morphological dictionaries (Salimzyanov et al. 2013), which take care of all of
the morphotactics and morphophonology and provide a basic vocabulary; this allows us to
concentrate our work on two fronts: building the lexical transfer part, that is, a bilingual dictionary
(already underway) and building structural transfer rules (grammatical rules for translation), which
will be the subject of this paper. On the other hand, building free/open-source dictionaries and rules
for English to Kazakh means that they will be freely available,⁹ for instance, to build translation
systems for other Turkic languages; this gives strategic value to our work, as most of the structural
transfer rules will be ready for use with other Turkic languages with little or no modification.¹⁰
The paper, which describes work in progress in the Apertium English-to-Kazakh structural
transfer, is organized as follows: Section 323 describes the free/open-source rule-based machine
translation platform, focusing on structural transfer. Section 0 describes the structural transfer rules
currently available to tackle the main syntactic divergences between English and Kazakh; section 0
describes some successful structural translations and some limitations, and, finally, section 0 gives
concluding remarks and outlines future work.