Proceedings of the 1st International Conference


Introduction
Speech recognition is the automatic conversion of human speech into the corresponding
text. Modern automatic speech recognition (ASR) systems have advanced significantly, from simple
speaker-dependent word recognition to speaker-independent large vocabulary continuous speech
recognition for transcribing broadcast news and telephone conversations. Despite the widespread
use of such systems in daily life, most of them target languages such as English,
German, Japanese, and Russian. The Kazakh language, by contrast, is still underrepresented in speech
recognition research. Thus, the primary goal of this work is to build a baseline large vocabulary
continuous speech recognition system for Kazakh.
Fig. 1 outlines the standard architecture of a modern ASR system, which includes feature
extraction and pre-processing, acoustic and language modeling, system combination, and decoding.
The first step in building such a system for Kazakh is to collect enough audio data and to create the
acoustic and language models. This is exactly how we approach the problem.
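To illustrate how the acoustic and language models in such an architecture typically interact during decoding, the following minimal sketch picks the word sequence W that maximizes log P(O|W) + λ · log P(W), where O is the observed audio. This is the generic textbook decision rule, not necessarily the decoder used in this work; the function name `decode`, the weight `lm_weight` (λ), and all scores below are hypothetical and chosen only for illustration.

```python
def decode(acoustic_logp, lm_logp, lm_weight=1.0):
    """Return (best hypothesis, score) under a combined acoustic + LM score.

    acoustic_logp: dict mapping hypothesis text -> log P(O | W)  (hypothetical values)
    lm_logp:       dict mapping hypothesis text -> log P(W)      (hypothetical values)
    lm_weight:     scaling factor applied to the language model score (assumed parameter)
    """
    best_hyp, best_score = None, float("-inf")
    for hyp, ac_score in acoustic_logp.items():
        if hyp not in lm_logp:
            # Skip hypotheses the language model cannot score.
            continue
        score = ac_score + lm_weight * lm_logp[hyp]
        if score > best_score:
            best_hyp, best_score = hyp, score
    return best_hyp, best_score


# Made-up log-probabilities for two competing hypotheses.
acoustic = {"recognize speech": -12.0, "wreck a nice beach": -11.5}
language = {"recognize speech": -3.0, "wreck a nice beach": -7.0}

best, score = decode(acoustic, language)
print(best, score)
```

In this toy example the acoustic model alone slightly prefers the second hypothesis, and the language model score is what tips the decision toward the first. A real large vocabulary decoder searches over a very large hypothesis space rather than enumerating a small dictionary.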
Section 2 of this paper presents an acoustic database of Kazakh speech; the experiments and
conclusions are given in Sections 3 and 4, respectively.

[Figure: block diagram with input "Speech" and output "Text"]

Figure 1. The architecture of a standard ASR system.
