Main Article Content
Speech is a key means of communication. Nowadays, speech is becoming a more common, if not standard, interface to technology. This will be seen within the trend of technology changes over the years. Increasingly, voice is employed to regulate programs, appliances and private devices within homes, cars, workplaces, and public spaces through smartphones and residential assistant devices using Amazon's Alexa, Google's Assistant and Apple's Siri, and other proliferating technologies. This is often achievable with the help of Automatic Speech Recognition (ASR). Automatic Speech Recognition is a process that accurately translates spoken utterances into text. These technologies enable machines to reply correctly and reliably to human voices and supply useful and valuable services. As communicating with computer is quicker using voice instead of using keyboard, so people will prefer such system. Communication among the person is dominated by speech, therefore it’s natural for people to expect voice interfaces with computer. This can be accomplished by developing speech to text which allows computer to translate voice request and dictation into text. The three models in traditional ASR system are acoustic model, language model and lexicon model. The challenges involved in Automatic Speech Recognition are different styles of speech, environment which include background noise and also accent of speaker. To mitigate these challenges, deep learning models are utilized. The main idea is to analyses features of input audio signals such as spectrogram and MFCC and to develop cutting edge deep learning models. The proposed end-to-end model achieved an error rate of 0.60 on Librispeech dataset.
TURCOMAT publishes articles under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This licensing allows for any use of the work, provided the original author(s) and source are credited, thereby facilitating the free exchange and use of research for the advancement of knowledge.
Detailed Licensing Terms
Attribution (BY): Users must give appropriate credit, provide a link to the license, and indicate if changes were made. Users may do so in any reasonable manner, but not in any way that suggests the licensor endorses them or their use.
No Additional Restrictions: Users may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.