A study of transformer‑based end‑to‑end speech recognition system for Kazakh language

Mamyrbayev, Orken; Oralbekova, Dina; Alimhan, Keylan; Turdalykyzy, Tolganay; Othman, Mohamed

dc.contributor.author	Mamyrbayev, Orken
dc.contributor.author	Oralbekova, Dina
dc.contributor.author	Alimhan, Keylan
dc.contributor.author	Turdalykyzy, Tolganay
dc.contributor.author	Othman, Mohamed
dc.date.accessioned	2024-10-18T10:00:02Z
dc.date.available	2024-10-18T10:00:02Z
dc.date.issued	2022
dc.identifier.issn	20452322
dc.identifier.other	DOI 10.1038/s41598-022-12260-y
dc.identifier.uri	http://rep.enu.kz/handle/enu/17980
dc.description.abstract	Today, the Transformer model, which allows parallelization and also has its own internal attention, has been widely used in the field of speech recognition. The great advantage of this architecture is the fast learning speed, and the lack of sequential operation, as with recurrent neural networks. In this work, Transformer models and an end-to-end model based on connectionist temporal classification were considered to build a system for automatic recognition of Kazakh speech. It is known that Kazakh is part of a number of agglutinative languages and has limited data for implementing speech recognition systems. Some studies have shown that the Transformer model improves system performance for low-resource languages. Based on our experiments, it was revealed that the joint use of Transformer and connectionist temporal classification models contributed to improving the performance of the Kazakh speech recognition system and with an integrated language model it showed the best character error rate 3.7% on a clean dataset.	ru
dc.language.iso	en	ru
dc.publisher	Scientific Reports	ru
dc.relation.ispartofseries	Volume 12, Issue 1; Article number 8337
dc.title	A study of transformer‑based end‑to‑end speech recognition system for Kazakh language	ru
dc.type	Article	ru