Understanding Emotions with Deep Learning: A Model for Detecting Speech and Facial Expressions
Abstract
In recent years, significant progress has been made in artificial intelligence, machine learning, and human-machine interaction. Voice interaction is an increasingly popular part of this progress: people can command machines to perform specific tasks, and devices such as smartphones and smart speakers now ship with voice assistants like Siri, Alexa, Cortana, and Google Assistant. Despite these advances, machines remain limited as true conversational partners: they struggle to recognize human emotions and respond appropriately, which is why emotion recognition from speech has become a cutting-edge research topic in human-machine interaction. As machines become more deeply embedded in daily life, there is a growing demand for more robust human-machine communication. Numerous researchers are working on speech emotion recognition (SER) with the aim of improving interactions between humans and machines; the ultimate goal is to develop computers capable of recognizing emotional states and reacting to them much as humans do. Achieving this goal depends on accurately extracting emotional features from speech and employing effective classifiers. In this project, we focused on identifying four emotional states from speech: anger, sadness, neutral, and happiness. To do so, we used a convolutional neural network (CNN) together with Mel Frequency Cepstral Coefficients (MFCCs) as the speech feature-extraction technique. Simulation results demonstrated the superiority of the proposed MFCC-CNN model over existing approaches. This promising outcome brings us closer to an emotionally intelligent human-machine interaction system, paving the way for more natural and meaningful exchanges between humans and technology.
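The abstract names MFCCs as the feature-extraction step that feeds the CNN. As a minimal sketch of how MFCC features are typically computed from a raw waveform (not the authors' implementation; all parameters here — 16 kHz sample rate, 512-sample frames, 26 mel bands, 13 coefficients — are illustrative assumptions), the pipeline frames the signal, takes a power spectrum, applies a triangular mel filterbank, log-compresses, and decorrelates with a DCT:

```python
import numpy as np

# Mel <-> Hz conversions used to place the filterbank on a perceptual scale
def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Return an (n_frames, n_mfcc) matrix of MFCC features (illustrative)."""
    # 1) Frame the signal and apply a Hamming window
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] for i in range(n_frames)])
    frames = frames * np.hamming(n_fft)

    # 2) Power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # 3) Triangular mel filterbank spanning 0 .. sr/2
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)

    # 4) Log mel energies (small epsilon avoids log(0))
    log_energy = np.log(power @ fbank.T + 1e-10)

    # 5) DCT-II over the mel bands keeps the first n_mfcc cepstral coefficients
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * n + 1) / (2.0 * n_mels)))
    return log_energy @ dct.T

# Demo on a synthetic one-second 440 Hz tone
sr = 16000
t = np.arange(sr) / sr
feats = mfcc(np.sin(2 * np.pi * 440.0 * t), sr)
print(feats.shape)  # (61, 13): 61 frames, 13 coefficients per frame
```

The resulting (frames × coefficients) matrix is the kind of 2-D feature map a CNN classifier can consume directly, treating time and cepstral index as the two spatial axes.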
Article Details
You are free to:
- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.