Abstract | This paper describes a three-year project at the National Research Council of Canada aimed at developing software to assist Indigenous communities in their efforts to preserve their languages and extend their use. The project aimed to work within the empowerment paradigm, in which the linguistic goals of communities carry at least equal weight with those of the researchers and collaboration with communities is central. Because many of the technological directions we took were in response to community needs, the project ended up as a collection of diverse subprojects. These include: a sophisticated framework for building verb conjugators for highly inflectional polysynthetic languages (a verb conjugator for Kanyen’kéha, in the Iroquoian language family, was built in the framework); the release of what is probably the largest available corpus of sentences in a polysynthetic language (Inuktut) aligned with English sentences, together with experiments on machine translation (MT) systems trained on this corpus; free online services based on automatic speech recognition (ASR) for easing the transcription bottleneck for recordings of speech in Indigenous languages (and other languages); limited-domain text-to-speech synthesis for some Indigenous languages; and several other subprojects. |
---|