
Sentiment Analysis and Topic Recognition in Video Transcriptions

121 Citations · 2021
Lukas Stappen, Alice Baird, Erik Cambria

This article uses SenticNet to extract natural language concepts and fine-tunes several feature types on a subset of MuSe-CAR, both to explore the content of a video and to learn to predict emotional valence, arousal, and speaker topic classes.

Abstract

Nowadays, videos are an integral modality for information sharing on the World Wide Web. However, systems able to automatically understand the content and sentiment of a video are still in their infancy. Linguistic information conveyed in the spoken parts of a video is known to carry valuable cues with regard to context and emotions. In this article, we explore a lexical knowledge-based extraction approach to obtain such understanding from the video transcriptions of a large-scale multimodal dataset (MuSe-CAR). To this end, we use SenticNet to extract natural language concepts and fine-tune several feature types on a subset of MuSe-CAR. With these features, we explore the content of a video as well as learn to predict emotional valence, arousal, and speaker topic classes. Our best model improves the linguistic baseline from the MuSe-Topic 2020 subchallenge by almost 3% (absolute) for the prediction of valence on the predefined challenge metric and outperforms a variety of baseline systems that require much higher computational power than the one proposed herein.
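To make the lexical knowledge-based idea concrete, the following is a minimal, illustrative sketch (not the authors' code) of matching transcript n-grams against a concept lexicon and averaging concept polarities into a crude valence score. The tiny `LEXICON` below is invented for demonstration; real SenticNet entries carry far richer polarity and semantic information.

```python
# Illustrative sketch of lexicon-based concept extraction from a
# transcript, in the spirit of SenticNet-style concept lookups.
# LEXICON and its polarity values are invented for this example.
from typing import Dict, List, Tuple

LEXICON: Dict[str, float] = {
    "comfortable_seat": 0.8,
    "smooth_ride": 0.7,
    "engine_noise": -0.4,
    "great": 0.9,
    "noisy": -0.6,
}

def extract_concepts(transcript: str, max_ngram: int = 2) -> List[Tuple[str, float]]:
    """Match word n-grams (joined with '_') against the concept lexicon."""
    tokens = transcript.lower().split()
    hits: List[Tuple[str, float]] = []
    for n in range(max_ngram, 0, -1):  # prefer longer (multiword) concepts
        for i in range(len(tokens) - n + 1):
            concept = "_".join(tokens[i:i + n])
            if concept in LEXICON:
                hits.append((concept, LEXICON[concept]))
    return hits

def valence_score(transcript: str) -> float:
    """Average polarity of matched concepts; 0.0 if nothing matched."""
    hits = extract_concepts(transcript)
    return sum(p for _, p in hits) / len(hits) if hits else 0.0

print(extract_concepts("the comfortable seat makes a smooth ride"))
# → [('comfortable_seat', 0.8), ('smooth_ride', 0.7)]
print(valence_score("the comfortable seat makes a smooth ride"))
# → 0.75
```

In the article, such extracted concepts serve as one feature type alongside fine-tuned representations; a regressor or classifier would then be trained on them rather than taking a simple average as done here.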
