Speech Emotion Recognition

Introduction

Speech Emotion Recognition (SER) is a subfield of speech processing and affective computing that aims to identify human emotions from speech signals. With the rapid development of deep learning, SER has advanced significantly and now supports a range of practical applications, including mental health monitoring, customer service analysis, human-computer interaction, and voice assistant enhancement.

Our research group applies machine learning and deep learning techniques, combined with acoustic features and linguistic knowledge, to develop high-performance SER systems. We also investigate methods for constructing emotional speech datasets and for building robust models for cross-corpus and cross-language emotion recognition.

Contact: Dr. Nguyen Thi Thu Trang | ✉️ trangntt@soict.hust.edu.vn

Research Directions

  • Multimodal Emotion Recognition: Combining speech, text, and facial expressions to improve emotion recognition accuracy. We investigate fusion techniques and attention mechanisms to leverage complementary information from multiple modalities.
  • Cross-corpus Emotion Recognition: Developing robust models that can generalize across different datasets and recording conditions. We address domain adaptation and transfer learning challenges in SER.
  • Real-time Emotion Detection: Building efficient models for real-time emotion detection in streaming audio. We focus on lightweight architectures and optimization techniques for deployment on edge devices.
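
As a rough illustration of the fusion idea mentioned under Multimodal Emotion Recognition, the sketch below shows weighted late fusion: each modality produces its own class probabilities, and these are averaged with per-modality weights. This is only one of many possible fusion schemes and is not necessarily the approach used by the group. The label set `EMOTIONS`, the function names, and the example weights are all hypothetical.

```python
from typing import Dict, List

# Hypothetical label set, chosen only for illustration.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def late_fusion(probs_by_modality: Dict[str, List[float]],
                weights: Dict[str, float]) -> List[float]:
    """Return the weighted average of per-modality class probabilities."""
    total = sum(weights[m] for m in probs_by_modality)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = [0.0] * len(EMOTIONS)
    for modality, probs in probs_by_modality.items():
        if len(probs) != len(EMOTIONS):
            raise ValueError(f"{modality}: expected {len(EMOTIONS)} probabilities")
        w = weights[modality] / total  # normalize so the result sums to 1
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused


def predict(probs_by_modality: Dict[str, List[float]],
            weights: Dict[str, float]) -> str:
    """Pick the emotion with the highest fused probability."""
    fused = late_fusion(probs_by_modality, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best]


if __name__ == "__main__":
    # Made-up probabilities from a speech model and a text model.
    example = {
        "speech": [0.6, 0.1, 0.2, 0.1],
        "text": [0.2, 0.5, 0.2, 0.1],
    }
    print(predict(example, {"speech": 0.5, "text": 0.5}))
    print(predict(example, {"speech": 0.2, "text": 0.8}))
```

Shifting weight toward one modality changes which emotion wins when the modalities disagree. Attention-based fusion can be seen as learning such weights per input instead of fixing them by hand.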

Team Members

Tran Long Vu

Team Leader

Nguyen Quang Vinh

Researcher

Do Duc Long

Researcher

Nguyen Minh Ngoc

Researcher

Pham Le Minh Quang

Researcher

Nguyen Trung Hieu

Researcher

Latest Publications

  1. P. V. Thanh, N. T. T. Huyen, P. N. Quan, N. T. T. Trang. A Robust Pitch-Fusion Model for Speech Emotion Recognition in Tonal Languages. ICASSP 2024, pp. 12386–12390. Seoul, Korea, 19 April 2024.