
RL-EMO: A Reinforcement Learning Framework for Multimodal Emotion Recognition

2 Citations • 2024
Chengwen Zhang, Yuhao Zhang, Bo Cheng
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Abstract

Multimodal Emotion Recognition in Conversation (ERC) has gained significant attention due to its wide-ranging applications in diverse areas. However, most previous approaches have focused on modeling context at the semantic level, neglecting dependency information at the emotional level. In this paper, we propose a novel Reinforcement Learning framework for the multimodal EMOtion recognition task (RL-EMO), which combines a Multi-modal Graph Convolution Network (MMGCN) [1] module with a novel Reinforcement Learning (RL) module to model context at the semantic and emotional levels, respectively. The RL-EMO approach was evaluated on two widely used multi-modal datasets, IEMOCAP and MELD, and the results show that RL-EMO outperforms several baseline models, achieving significant improvements in F1-score. We release the code at https://github.com/zyh9929/RL-EMO.
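
The sketch below is not the authors' implementation; it only illustrates, under stated assumptions, how a graph-convolution encoder for semantic-level context might be paired with a policy that labels utterances sequentially so that emotional-level dependencies between consecutive predictions can be rewarded. The module names, feature dimensions, uniform adjacency, REINFORCE objective, and per-utterance correctness reward are all illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch (assumed design, not RL-EMO's actual code):
# a GCN-style encoder models semantic context over fused multimodal
# utterance features; a policy head predicts emotion labels one by one,
# conditioned on the previous label, and is trained with REINFORCE.

import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticGraphEncoder(nn.Module):
    """One graph-convolution step over a fully connected conversation graph."""

    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, hid_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_utterances, in_dim) fused text+audio+visual features.
        n = x.size(0)
        adj = torch.ones(n, n) / n          # placeholder uniform adjacency
        return F.relu(adj @ self.proj(x))   # aggregate neighbour information


class EmotionPolicy(nn.Module):
    """Labels utterances sequentially, conditioned on the previous label."""

    def __init__(self, hid_dim: int, num_emotions: int):
        super().__init__()
        self.emb = nn.Embedding(num_emotions + 1, hid_dim)  # +1 start token
        self.head = nn.Linear(2 * hid_dim, num_emotions)

    def forward(self, h: torch.Tensor, prev_label: torch.Tensor) -> torch.Tensor:
        return self.head(torch.cat([h, self.emb(prev_label)], dim=-1))


def reinforce_step(encoder, policy, feats, gold_labels, optimizer):
    """One REINFORCE update; reward is per-utterance correctness (an assumption)."""
    h = encoder(feats)
    prev = torch.tensor(policy.emb.num_embeddings - 1)  # start token index
    log_probs, rewards = [], []
    for t in range(h.size(0)):
        dist = torch.distributions.Categorical(logits=policy(h[t], prev))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        rewards.append((action == gold_labels[t]).float())
        prev = action
    loss = -(torch.stack(log_probs) * torch.stack(rewards)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    enc, pol = SemanticGraphEncoder(300, 64), EmotionPolicy(64, 6)
    opt = torch.optim.Adam(list(enc.parameters()) + list(pol.parameters()), lr=1e-3)
    feats = torch.randn(5, 300)            # 5 utterances in one conversation
    labels = torch.randint(0, 6, (5,))
    print(reinforce_step(enc, pol, feats, labels, opt))
```

Conditioning each prediction on the previously sampled label is one simple way such an RL module could capture emotional-level context that a purely semantic encoder misses; the actual reward design and graph construction in RL-EMO should be taken from the released code.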