A NOVEL SPEECH RECOGNITION SYSTEM USING FUZZY NEURAL NETWORK

B. Kishore Babu
Dr. Rakesh Mutukuru

Abstract

Human speech is an important subject in digital speech processing, and speech recognition has long been an active topic in signal processing and artificial intelligence research. Speech recognition is an interdisciplinary subfield of Natural Language Processing (NLP) that enables machines to recognize spoken language and transform it into text. Speech recognition has the potential to make communication with machines easier, and it has enabled technology that can communicate in real time. However, challenges remain, such as speaker variance caused by factors including age, gender, speaking rate, pronunciation variations, and background noise. Classification of age and gender is therefore important for speech processing, and considerable work has been done to improve each phase of the recognition pipeline in order to obtain more accurate results. The main goal of this analysis is the integration of machine learning into the speech recognition system; hence, it presents a novel speech recognition system using a fuzzy neural network. A speech recognition system consists of pre-processing, speech signal segmentation, speech feature extraction, and speaker recognition phases. The proposed system uses a fuzzy neural network to identify the speaker's age and gender, and the fuzzy model's performance is evaluated using accuracy, precision, and F1-score.
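The abstract names accuracy, precision, and F1-score as the evaluation metrics. As a minimal sketch (not the authors' code, and using hypothetical toy labels), these metrics can be computed for a binary gender-classification output as follows:

```python
# Sketch of the evaluation metrics named in the abstract: accuracy,
# precision, and F1-score for a toy binary gender-classification task.
# The labels below are hypothetical, for illustration only.

def accuracy(y_true, y_pred):
    # Fraction of utterances whose predicted label matches the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive="female"):
    # Of all utterances predicted as the positive class, how many were correct.
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    predicted_pos = sum(p == positive for p in y_pred)
    return tp / predicted_pos if predicted_pos else 0.0

def recall(y_true, y_pred, positive="female"):
    # Of all utterances that truly belong to the positive class, how many were found.
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return tp / actual_pos if actual_pos else 0.0

def f1_score(y_true, y_pred, positive="female"):
    # Harmonic mean of precision and recall.
    p = precision(y_true, y_pred, positive)
    r = recall(y_true, y_pred, positive)
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Hypothetical true and predicted labels for six test utterances.
truth = ["female", "male", "female", "male", "female", "male"]
pred  = ["female", "male", "male",   "male", "female", "female"]

print(accuracy(truth, pred))   # 4 of 6 correct -> 0.666...
print(precision(truth, pred))  # 2 true positives of 3 predicted -> 0.666...
print(f1_score(truth, pred))
```

The same definitions extend to the multi-class age-group setting by computing precision and recall per class and averaging.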

Article Details

How to Cite
Babu, B. K., & Mutukuru, D. R. (2020). A NOVEL SPEECH RECOGNITION SYSTEM USING FUZZY NEURAL NETWORK. Turkish Journal of Computer and Mathematics Education (TURCOMAT), 11(3), 2853–2864. https://doi.org/10.61841/turcomat.v11i3.14608
Section
Research Articles
