| Download | - View the final version: Transfer learning improves French cross-domain dialect identification: NRC @ VarDial 2022 (PDF, 319 KiB)
|
|---|
| Author | Bernier-Colborne, Gabriel¹; Leger, Serge¹; Goutte, Cyril¹ |
|---|
| Affiliation | - ¹ National Research Council Canada. Digital Technologies
|
|---|
| Format | Text, Article |
|---|
| Conference | Ninth Workshop on NLP for Similar Languages, Varieties and Dialects, October 2022, Gyeongju, Republic of Korea |
|---|
| Abstract | We describe the systems developed by the National Research Council Canada for the French Cross-Domain Dialect Identification shared task at the 2022 VarDial evaluation campaign. We evaluated two different approaches to this task: SVM and probabilistic classifiers exploiting n-grams as features, and trained from scratch on the data provided; and a pre-trained French language model, CamemBERT, that we fine-tuned on the dialect identification task. The latter method turned out to improve the macro-F1 score on the test set from 0.344 to 0.430 (25% increase), which indicates that transfer learning can be helpful for dialect identification. |
|---|
| Publication date | 2022-10-06 |
|---|
| Publisher | Association for Computational Linguistics |
|---|
| Licence | |
|---|
| In | |
|---|
| Language | English |
|---|
| Peer-reviewed publication | Yes |
|---|
| Record identifier | 7d0c4e22-ed47-4519-a0d3-0f1c1b25b516 |
|---|
| Record created | 2022-10-19 |
|---|
| Record modified | 2022-10-21 |
|---|