National Research Council of Canada. NRC Institute for Information Technology
Recent Advances in Natural Language Processing (RANLP), Modern Approaches in Translation Technologies Workshop, September 2005, Borovets, Bulgaria
statistical machine translation; lexical resources; keyphrase list
This paper presents a novel strategy for translating lists of keyphrases. Such lists typically appear in scientific articles, information retrieval systems, and web page metadata. Our system combines a statistical translation model, trained on a bilingual corpus of scientific papers, with sense-focused look-up in a large bilingual terminological resource. For the latter, we developed a technique that treats the keyphrase list itself as context for sense disambiguation. The optimal combination of modules was found by a genetic algorithm. Our work applies to the French/English language pair.