
Context- and Knowledge-Aware Graph Convolutional Network for Multimodal Emotion Recognition


Abstract:

This work proposes an approach for emotion recognition in conversation that leverages context modeling, knowledge enrichment, and multimodal (text and audio) learning based on a graph convolutional network (GCN). We first construct two distinct graphs to model the contextual interaction and the knowledge dynamics. We then introduce an affective lexicon into knowledge graph construction to enrich the emotional polarity of each concept, that is, the related knowledge of each token in an utterance. Next, we balance the context and the affect-enriched knowledge by incorporating both into the construction of a new adjacency matrix for the GCN architecture, and train the model jointly over multiple modalities to effectively capture the semantics-sensitive and knowledge-sensitive contextual dependence of each conversation. Our model outperforms state-of-the-art benchmarks with over 22.6% and 11% relative error reduction in weighted F1 on the IEMOCAP and MELD databases, respectively, demonstrating the superiority of our method for emotion recognition.
Published in: IEEE MultiMedia (Volume: 29, Issue: 3, July-Sept. 2022)
Page(s): 91 - 100
Date of Publication: 10 May 2022


Emotion recognition in conversations (ERC) has attracted increasing attention because it is a necessary step for a number of applications, including the analysis of social media threads (e.g., on YouTube, Facebook, and Twitter) and human–computer interaction. Unlike nonconversational settings, “context” is a vital component of ERC: it denotes the preceding dialog content of a target utterance. The intention and emotion of a target utterance are largely shaped by the surrounding context, as the conversations in Figure 1 illustrate. It is therefore important, but challenging, to effectively model the contextual dependence within conversations.
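The graph construction described in the abstract can be pictured with a short, self-contained sketch. The following is a minimal illustration, not the authors' implementation: it mixes cosine similarity between utterance features (semantic context edges) with a simple knowledge-derived affect score per utterance (knowledge edges), restricts edges to a conversational context window, and feeds the normalized graph to a standard GCN layer. Names such as affect_scores, context_window, and alpha are illustrative assumptions rather than quantities defined in the paper.

```python
# Minimal sketch (assumed, not the authors' code): combine semantic-context and
# knowledge-based affect edges into one adjacency matrix, then apply a GCN layer.
import torch
import torch.nn as nn
import torch.nn.functional as F


def build_adjacency(feats: torch.Tensor, affect_scores: torch.Tensor,
                    context_window: int = 4, alpha: float = 0.5) -> torch.Tensor:
    """feats: (N, D) fused utterance features; affect_scores: (N,) lexicon-derived
    affective weights in [0, 1]; returns a normalized (N, N) adjacency matrix."""
    n = feats.size(0)
    # semantic context edges: pairwise cosine similarity between utterances
    sim = F.cosine_similarity(feats.unsqueeze(1), feats.unsqueeze(0), dim=-1)
    # knowledge edges: average of the two utterances' affect scores
    know = 0.5 * (affect_scores.unsqueeze(1) + affect_scores.unsqueeze(0))
    # balance context vs. knowledge with a mixing weight alpha (assumed)
    adj = alpha * sim + (1.0 - alpha) * know
    # keep only edges inside the conversational context window
    idx = torch.arange(n)
    mask = (idx.unsqueeze(1) - idx.unsqueeze(0)).abs() <= context_window
    adj = adj * mask.float()
    # symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as in a standard GCN
    adj = adj + torch.eye(n)
    deg_inv_sqrt = adj.sum(dim=1).clamp(min=1e-6).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)


class GCNLayer(nn.Module):
    """One graph-convolution layer: H' = ReLU(A_hat H W)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, adj: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        return F.relu(adj @ self.linear(h))


if __name__ == "__main__":
    feats = torch.randn(6, 100)        # toy conversation: 6 utterances, 100-dim features
    affect = torch.rand(6)             # e.g., lexicon-based valence per utterance
    adj = build_adjacency(feats, affect)
    out = GCNLayer(100, 6)(adj, feats) # logits over 6 emotion classes
    print(out.shape)                   # torch.Size([6, 6])
```

In this toy setting, alpha controls the trade-off between semantic and knowledge edges; the paper's actual adjacency construction, features, and training procedure differ in detail.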

References
1. S. Poria, E. Cambria, D. Hazarika, N. Majumder, A. Zadeh, and L.-P. Morency, "Context-dependent sentiment analysis in user-generated videos," Proc. 55th Annu. Meeting Assoc. Comput. Linguistics, pp. 873-883, 2017.
2. N. Majumder et al., "DialogueRNN: An attentive RNN for emotion detection in conversations," Proc. AAAI Conf. Artif. Intell., vol. 33, pp. 6818-6825, 2019.
3. D. Ghosal, N. Majumder, S. Poria, and A. Gelbukh, "DialogueGCN: A graph convolutional neural network for emotion recognition in conversation," Proc. Conf. Empirical Methods Natural Lang. Process. 9th Int. Joint Conf. Natural Lang. Process., pp. 154-164, 2019.
4. C. Busso et al., "IEMOCAP: Interactive emotional dyadic motion capture database," Lang. Resour. Eval., vol. 42, no. 4, pp. 335-359, 2008.
5. Y. Fu et al., "ConSK-GCN: Conversational semantic- and knowledge-oriented graph convolutional network for multimodal emotion recognition," Proc. Int. Conf. Multimedia Expo, pp. 1-6, 2021.
6. P. Tzirakis, G. Trigeorgis, M. A. Nicolaou, B. W. Schuller, and S. Zafeiriou, "End-to-end multimodal emotion recognition using deep neural networks," IEEE J. Sel. Topics Signal Process., vol. 11, no. 8, pp. 1301-1309, Dec. 2017.
7. N. Li, B. Liu, Z. Han, Y.-S. Liu, and J. Fu, "Emotion reinforced visual storytelling," Proc. Int. Conf. Multimedia Retrieval, pp. 297-305, 2019.
8. T. Mittal, P. Guhan, U. Bhattacharya, R. Chandra, A. Bera, and D. Manocha, "EmotiCon: Context-aware multimodal emotion recognition using Frege's principle," Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., pp. 14234-14243, 2020.
9. M. Schlichtkrull et al., "Modeling relational data with graph convolutional networks," Proc. Eur. Semantic Web Conf., pp. 593-607, 2018.
10. S. Tripathi, S. Tripathi, and H. Beigi, "Multi-modal emotion recognition on IEMOCAP dataset using deep learning," 2018.
11. Y. Kim, "Convolutional neural networks for sentence classification," Proc. Conf. Empirical Methods Natural Lang. Process., pp. 1746-1751, 2014.
12. T. Young, E. Cambria, I. Chaturvedi, H. Zhou, S. Biswas, and M. Huang, "Augmenting end-to-end dialogue systems with commonsense knowledge," Proc. AAAI Conf. Artif. Intell., vol. 32, no. 1, pp. 4970-4977, 2018.
13. P. Zhong, D. Wang, and C. Miao, "Knowledge-enriched transformer for emotion detection in textual conversations," Proc. Conf. Empirical Methods Natural Lang. Process. 9th Int. Joint Conf. Natural Lang. Process., pp. 165-176, 2019.
14. R. Speer, J. Chin, and C. Havasi, "ConceptNet 5.5: An open multilingual graph of general knowledge," Proc. AAAI Conf. Artif. Intell., pp. 4444-4451, 2017.
15. S. Mohammad, "Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words," Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, pp. 174-184, 2018.
16. P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, "Enriching word vectors with subword information," Trans. Assoc. Comput. Linguistics, vol. 5, pp. 135-146, 2017.
17. S. Poria, D. Hazarika, N. Majumder, G. Naik, E. Cambria, and R. Mihalcea, "MELD: A multimodal multi-party dataset for emotion recognition in conversations," Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, pp. 527-536, 2019.
18. L. Guo, L. Wang, J. Dang, L. Zhang, and H. Guan, "A feature fusion method based on extreme learning machine for speech emotion recognition," Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 2666-2670, 2018.
19. J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Lang. Technol., pp. 4171-4186, 2019.
20. C. E. Osgood, "The nature and measurement of meaning," Psychol. Bull., vol. 49, no. 3, pp. 197-237, 1952.
