Download | - View accepted manuscript: Sentiment analysis of short informal texts (PDF, 930 KiB)
|
---|
DOI | https://doi.org/10.1613/jair.4272 |
---|
Author | Kiritchenko, Svetlana; Zhu, Xiaodan; Mohammad, Saif M. |
---|
Affiliation | - National Research Council of Canada. Information and Communication Technologies
|
---|
Format | Text, Article |
---|
Subject | Classification (of information); Semantics; Text processing; Ablation experiments; Automatically generated; Percentage points; Sentiment analysis; Sentiment features; Sentiment lexicons; State-of-the-art |
---|
Abstract | We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. The system ranked first in the SemEval-2013 shared task 'Sentiment Analysis in Twitter' (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points. |
---|
Publication date | 2014-08-01 |
---|
Language | English |
---|
Peer-reviewed publication | Yes |
---|
NPARC number | 21275945 |
---|
Record identifier | f3c48029-99e0-48c7-9aaf-271e9715465b |
---|
Record created | 2015-08-12 |
---|
Record modified | 2020-06-04 |
---|