ACA: the Translation Bureau’s Assistant Client Advisor

Alternative title	ACA: the Translation Bureau’s Assistant Client Advisor. Final report on the domain prediction management environment
Download	View final version: ACA: the Translation Bureau’s Assistant Client Advisor (PDF, 603 KiB)
DOI	https://doi.org/10.4224/40003459
Author	Simard, Michel¹ (ORCID: https://orcid.org/0009-0002-5317-3063); Bernier-Colborne, Gabriel¹; Goutte, Cyril¹ (ORCID: https://orcid.org/0000-0003-4939-6555); Léger, Serge¹; Tessier, Marc¹ (ORCID: https://orcid.org/0009-0009-1413-2892)
Affiliation	National Research Council Canada. Digital Technologies
Format	Text, Technical Report
Physical description	19 p.
Subject	text domain prediction; translation; natural language processing
Abstract	This document reports on the project "Document Workflow (ACA)", part of the BtB-NRC Collaboration Agreement 2021-22, titled "Artificial Intelligence for Translation Quality" (AI4TQ). The project builds computer support tools for the Translation Bureau's client advisors, specifically to identify the specialty domain of texts submitted for translation. In a previous project, we used Bureau data to create classifiers that identify the domain to which a given document belongs, with an accuracy close to 80%. We delivered a system to the Bureau, called the "Assistant Client Advisor" (ACA), which provides document classification as a web service, accessible through both a web-based user interface (UI) and an Application Programming Interface (API). In this project, we have greatly expanded this system by developing a set of functionalities for creating, updating, evaluating and deploying domain predictors. The API and UI of the new system will allow the Bureau to create and maintain domain predictors itself. In addition, we have experimented with approaches to improve prediction accuracy, most notably through neural networks. The new API allows creating and using predictors based on the FastText neural network technology, in addition to the previously available algorithms, SVM and ProbCat. In a series of experiments on Confidence Estimation, we analyzed the performance of the classifiers and the relationship between classification accuracy and some numerical indicators produced by the classifiers. The aim was to distinguish documents that can be handled automatically from documents that should be verified by a client advisor, so as to minimize both domain prediction errors and human workload. Finally, we have added functionalities to segment large documents into smaller pieces, based on the predicted domain of individual segments of text.
Publication date	2023-03-31
Publisher	National Research Council of Canada. Digital Technologies Research Centre
Language	English
Peer reviewed	No
Record identifier	ef3135f0-2112-4227-980c-844bc7462d1b
Record created	2025-02-24
Record modified	2026-03-18

Date modified: 2026-04-20