Understanding emotions: a dataset of tweets to study interactions between affect categories

From National Research Council Canada

Download	View final version: Understanding emotions: a dataset of tweets to study interactions between affect categories (PDF, 688 KiB)
Author	Mohammad, Saif M.¹; Kiritchenko, Svetlana¹
Affiliation	National Research Council of Canada. Digital Technologies
Format	Text, Article
Conference	LREC 2018, Eleventh International Conference on Language Resources and Evaluation, May 7-12, 2018, Miyazaki, Japan
Subject	emotion intensity; valence; arousal; dominance; basic emotions; crowdsourcing; sentiment analysis
Abstract	Human emotions are complex and nuanced. Yet, an overwhelming majority of the work in automatically detecting emotions from text has focused only on classifying text into positive, negative, and neutral classes, and a much smaller amount on classifying text into basic emotion categories such as joy, sadness, and fear. Our goal is to create a single textual dataset that is annotated for many emotion (or affect) dimensions (from both the basic emotion model and the VAD model). For each emotion dimension, we annotate the data for not just coarse classes (such as anger or no anger) but also for fine-grained real-valued scores indicating the intensity of emotion (anger, sadness, valence, etc.). We use Best–Worst Scaling (BWS) to address the limitations of traditional rating scale methods such as inter- and intra-annotator inconsistency by employing comparative annotations. We show that the fine-grained intensity scores thus obtained are reliable (repeat annotations lead to similar scores). We choose Twitter as the source of the textual data we annotate because tweets are self-contained, widely used, public posts, and tend to be rich in emotions. The new dataset is useful for training and testing supervised machine learning algorithms for multi-label emotion classification, emotion intensity regression, detecting valence, detecting ordinal class of intensity of emotion (slightly sad, very angry, etc.), and detecting ordinal class of valence (or sentiment). We make the data available for the recent SemEval-2018 Task 1: Affect in Tweets, which explores these five tasks. The dataset also sheds light on crucial research questions such as: which emotions often present together in tweets?; how do the intensities of the three negative emotions relate to each other?; and how do the intensities of the basic emotions relate to valence?
Publication date	2018-05-12
Publisher	European Language Resources Association
In	Eleventh International Conference on Language Resources and Evaluation: 198–209.
Language	English
Peer reviewed	Yes
Record identifier	2c21fb84-2b97-44b0-a698-4f4e93d8ee93
Record created	2019-06-12
Record modified	2020-05-30

Date modified: 2024-04-20