dc.contributor.author |
Mamyrbayev, Orken |
|
dc.contributor.author |
Oralbekova, Dina |
|
dc.contributor.author |
Alimhan, Keylan |
|
dc.contributor.author |
Turdalykyzy, Tolganay |
|
dc.contributor.author |
Othman, Mohamed |
|
dc.date.accessioned |
2024-10-18T10:00:02Z |
|
dc.date.available |
2024-10-18T10:00:02Z |
|
dc.date.issued |
2022 |
|
dc.identifier.issn |
20452322 |
|
dc.identifier.other |
DOI 10.1038/s41598-022-12260-y |
|
dc.identifier.uri |
http://rep.enu.kz/handle/enu/17980 |
|
dc.description.abstract |
The Transformer model, which allows parallelization and relies on self-attention, is now widely
used in the field of speech recognition. The main advantages of this architecture are its fast
training speed and the absence of the sequential operations required by recurrent neural networks.
In this work, Transformer models and an end-to-end model based on connectionist temporal
classification were considered for building a system for automatic recognition of Kazakh speech.
Kazakh is an agglutinative language and has limited data available for implementing speech
recognition systems. Some studies have shown that the Transformer model improves system
performance for low-resource languages. Our experiments showed that the joint use of Transformer
and connectionist temporal classification models improved the performance of the Kazakh speech
recognition system, and that with an integrated language model it achieved the best character
error rate of 3.7% on a clean dataset. |
ru |
dc.language.iso |
en |
ru |
dc.publisher |
Scientific Reports |
ru |
dc.relation.ispartofseries |
Volume 12, Issue 1; Article number 8337 |
|
dc.title |
A study of transformer-based end-to-end speech recognition system for Kazakh language |
ru |
dc.type |
Article |
ru |