Emotion Recognition in Speech: Exploiting ResNet50 and Attention Mechanism

Priya, V. Savithri Padma and Meyyappan, Senthilkumar and Vallathan, G. and Ammai, N. Meenatchi (2024) Emotion Recognition in Speech: Exploiting ResNet50 and Attention Mechanism. In: UNSPECIFIED.

Full text not available from this repository.

Abstract

This research explores the application of ResNet50 combined with an attention mechanism for emotion classification in speech signals. With the increasing need for effective emotion recognition systems in applications such as virtual assistants and mental health monitoring, leveraging advanced deep learning techniques becomes crucial. ResNet50, a deep convolutional neural network, excels at extracting complex features from audio representations such as spectrograms and Mel-frequency cepstral coefficients (MFCCs). By integrating an attention mechanism, the model can focus on significant temporal and spectral features, enhancing its ability to capture emotional nuances. This study evaluates the proposed model's performance on benchmark emotion datasets, comparing it with traditional approaches and other deep learning architectures. Results indicate that the ResNet50-attention model achieves superior classification accuracy and improved interpretability, effectively distinguishing various emotional states in speech. The findings suggest that this hybrid approach offers a promising direction for future research in emotion recognition, facilitating more responsive and empathetic AI systems.
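
To illustrate the kind of architecture the abstract describes, the sketch below shows one plausible way to pair a ResNet50 backbone with attention pooling over spectrogram features in PyTorch. This is not the authors' implementation: the single-channel input adaptation, the time-axis attention head, the eight-class output, and all layer sizes are assumptions made for the example; the full text is not available from this repository.

import torch
import torch.nn as nn
from torchvision.models import resnet50

class ResNet50Attention(nn.Module):
    """Illustrative ResNet50 + attention pooling for speech emotion classes."""
    def __init__(self, num_classes=8):  # number of emotion classes is assumed
        super().__init__()
        backbone = resnet50(weights=None)
        # Accept single-channel log-Mel spectrograms instead of RGB images.
        backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                   padding=3, bias=False)
        # Keep the convolutional stages; drop the ImageNet pooling/classifier.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        # Attention weights computed over the time axis of the feature map.
        self.attention = nn.Sequential(
            nn.Conv1d(2048, 256, kernel_size=1),
            nn.Tanh(),
            nn.Conv1d(256, 1, kernel_size=1),
        )
        self.classifier = nn.Linear(2048, num_classes)

    def forward(self, x):
        f = self.features(x)                          # (B, 2048, F', T')
        f = f.mean(dim=2)                             # pool frequency -> (B, 2048, T')
        w = torch.softmax(self.attention(f), dim=-1)  # temporal attention (B, 1, T')
        pooled = (f * w).sum(dim=-1)                  # attention-weighted pooling
        return self.classifier(pooled)

# Example: a batch of 4 spectrograms with 128 Mel bins and 300 frames.
model = ResNet50Attention(num_classes=8)
logits = model(torch.randn(4, 1, 128, 300))
print(logits.shape)  # torch.Size([4, 8])

The attention head here reweights time steps before pooling, which is one common way such models emphasize emotionally salient segments of an utterance; the paper's exact mechanism may differ.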

Item Type: Conference or Workshop Item (Paper)
Subjects: Computer Science > Artificial Intelligence
Divisions: Engineering and Technology > Aarupadai Veedu Institute of Technology, Chennai > Computer Science Engineering
Depositing User: Unnamed user with email techsupport@mosys.org
Last Modified: 27 Nov 2025 07:10
URI: https://vmuir.mosys.org/id/eprint/2103
