National Research Council of Canada. Digital Technologies
2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), October 13-16, 2016, Omaha, NE, USA
The automatic detection of emotions in Twitter posts is a challenging task due to the informal nature of the language used on this platform. In this paper, we propose a methodology for expanding the NRC word-emotion association lexicon to cover the language used on Twitter. We perform this expansion using multi-label classification of words, and we compare different word-level features extracted from unlabelled tweets, such as unigrams, Brown clusters, POS tags, and word2vec embeddings. The results show that the expanded lexicon achieves major improvements over the original lexicon when classifying tweets into emotional categories. In contrast to previous work, our methodology does not depend on tweets annotated with emotional hashtags, which enables the identification of emotional words in any domain-specific collection of unlabelled tweets.
2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI): 536–539.
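
The lexicon expansion described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the classifier, so a binary-relevance scheme with a nearest-centroid rule is assumed here as a stand-in for multi-label word classification. The word vectors, the seed lexicon entries, and the function names (`train`, `predict`, `expand`) are all hypothetical, chosen only to show the idea of labelling out-of-lexicon words from features learned on unlabelled tweets.

```python
import math

# Hypothetical 3-dimensional word vectors, standing in for word-level
# features (e.g. word2vec embeddings) learned from unlabelled tweets.
VECTORS = {
    "happy":   [0.90, 0.10, 0.00],
    "joyful":  [0.80, 0.20, 0.10],
    "angry":   [0.10, 0.90, 0.00],
    "furious": [0.00, 0.80, 0.20],
    "scared":  [0.10, 0.20, 0.90],
    "yay":     [0.85, 0.15, 0.05],  # not in the seed lexicon
    "grrr":    [0.05, 0.90, 0.10],  # not in the seed lexicon
}

# Hypothetical seed lexicon: word -> set of emotion labels (multi-label).
SEED_LEXICON = {
    "happy": {"joy"},
    "joyful": {"joy"},
    "angry": {"anger"},
    "furious": {"anger"},
    "scared": {"fear"},
}


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def train(lexicon, vectors):
    """Binary relevance: one positive and one negative centroid per emotion."""
    emotions = sorted(set().union(*lexicon.values()))
    model = {}
    for emotion in emotions:
        pos = [vectors[w] for w, labels in lexicon.items() if emotion in labels]
        neg = [vectors[w] for w, labels in lexicon.items() if emotion not in labels]
        model[emotion] = (centroid(pos), centroid(neg))
    return model


def predict(model, vector):
    """Return every emotion whose positive centroid is closer than its negative one."""
    return {
        emotion
        for emotion, (pos, neg) in model.items()
        if cosine(vector, pos) > cosine(vector, neg)
    }


def expand(lexicon, vectors):
    """Label every word that has a vector but no entry in the seed lexicon."""
    model = train(lexicon, vectors)
    return {w: predict(model, v) for w, v in vectors.items() if w not in lexicon}


expanded = expand(SEED_LEXICON, VECTORS)
```

Because each emotion is decided independently, a word may receive several labels or none, which is the property that distinguishes multi-label from single-label classification. In practice the toy vectors would be replaced by features computed from a real tweet collection, and the centroid rule by a trained classifier.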