Context- and Knowledge-Aware Graph Convolutional Network for Multimodal Emotion Recognition


Abstract:

This work proposes an approach to emotion recognition in conversation that combines context modeling, knowledge enrichment, and multimodal (text and audio) learning based on a graph convolutional network (GCN). We first construct two distinct graphs, one modeling contextual interaction and one modeling knowledge dynamics. We then introduce an affective lexicon into knowledge graph construction to enrich the emotional polarity of each concept, that is, the related knowledge of each token in an utterance. Next, we balance the context and the affect-enriched knowledge by incorporating both into the construction of a new adjacency matrix for the GCN architecture, and train them jointly with multiple modalities to effectively capture the semantics-sensitive and knowledge-sensitive contextual dependence of each conversation. Our model outperforms the state-of-the-art benchmarks by over 22.6% and 11% relative error reduction in weighted-F1 on the IEMOCAP and MELD databases, respectively, demonstrating the superiority of our method in emotion recognition.
Published in: IEEE MultiMedia (Volume: 29, Issue: 3, 01 July-Sept. 2022)
Page(s): 91 - 100
Date of Publication: 10 May 2022


Emotion recognition in conversations (ERC) has attracted increasing attention because it is a necessary step for a number of applications, including social media threads (on platforms such as YouTube, Facebook, and Twitter) and human–computer interaction. Unlike nonconversational settings, ERC depends critically on “context,” the dialog content that precedes a target utterance. The intention and emotion of a target utterance are largely shaped by this surrounding context, as the conversations in Figure 1 illustrate. Effectively modeling the contextual dependence within conversations is therefore important but challenging.
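
The abstract describes folding a context graph and an affect-enriched knowledge graph into a single adjacency matrix for the GCN. The page does not give the actual construction, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple convex blend with a hypothetical weight `lam`, the commonly used symmetric normalization with self-loops, a hypothetical window-based context graph, and random placeholder features and knowledge scores. All function and variable names are illustrative.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative adjacency matrix."""
    adj = adj + np.eye(adj.shape[0])      # self-loops keep every degree > 0
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Scale row i by d_i^{-1/2} and column j by d_j^{-1/2} via broadcasting.
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def blended_gcn_layer(features, adj_context, adj_knowledge, weight, lam=0.5):
    """One GCN layer over a blend of two graphs (assumed form, not from the paper).

    lam is a hypothetical balance factor: 1.0 uses only the context graph,
    0.0 uses only the knowledge graph.
    """
    adj = lam * adj_context + (1.0 - lam) * adj_knowledge
    adj_hat = normalize_adjacency(adj)
    return np.maximum(adj_hat @ features @ weight, 0.0)   # ReLU activation


# Toy conversation with 4 utterances; all values are random placeholders.
rng = np.random.default_rng(0)
n = 4
text_feats = rng.normal(size=(n, 5))      # stand-in for text embeddings
audio_feats = rng.normal(size=(n, 2))     # stand-in for audio features
features = np.concatenate([text_feats, audio_feats], axis=1)

# Assumed context graph: each utterance is linked to its immediate neighbors.
adj_context = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if abs(i - j) == 1:
            adj_context[i, j] = 1.0

# Assumed knowledge graph: symmetric non-negative scores between utterances.
scores = rng.random((n, n))
adj_knowledge = (scores + scores.T) / 2.0
np.fill_diagonal(adj_knowledge, 0.0)

weight = rng.normal(size=(features.shape[1], 3))
out = blended_gcn_layer(features, adj_context, adj_knowledge, weight, lam=0.6)
print(out.shape)  # (4, 3)
```

Here the two modalities are simply concatenated per utterance before the graph convolution. How the paper fuses text and audio, and how it weights context against knowledge, may differ from this sketch.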
