Author | Carpuat, Marine¹; Daume III, Hal; Henry, Katie; Irvine, Ann; Jagarlamudi, Jagadeesh; Rudinger, Rachel |
---|
Affiliation | National Research Council of Canada. Information and Communication Technologies |
---|
Format | Text, Article |
---|
Conference | 51st Annual Meeting of the Association for Computational Linguistics, August 4-9 2013, Sofia, Bulgaria |
---|
Abstract | Words often gain new senses in new domains. Being able to automatically identify, from a corpus of monolingual text, which word tokens are being used in a previously unseen sense has applications to machine translation and other tasks sensitive to lexical semantics. We define a task, SenseSpotting, in which we build systems to spot tokens that have new senses in new domain text. Instead of difficult and expensive annotation, we build a gold standard by leveraging cheaply available parallel corpora, targeting our approach to the problem of domain adaptation for machine translation. Our system is able to achieve F-measures of as much as 80% when applied to word types it has never seen before. Our approach is based on a large set of novel features that capture varied aspects of how words change when used in new domains. |
---|
Publication date | 2013 |
---|
Publisher | Association for Computational Linguistics |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
NPARC number | 23000603 |
---|
Record identifier | 559b8e7b-80bf-4aec-a2a0-33ddb4572af4 |
---|
Record created | 2016-08-04 |
---|
Record modified | 2020-04-22 |
---|