Download | Accepted manuscript: Tightly Packed Tries: How to Fit Large Models into Memory, and Make them Load Fast, Too (PDF, 569 KiB) |
---|
Author | Germann, Ulrich (1); Joanis, Eric (1); Larkin, Samuel (1) |
---|
Affiliation | (1) National Research Council of Canada. NRC Institute for Information Technology |
---|
Format | Text, Article |
---|
Conference | Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009), Boulder, CO, USA, June 05, 2009 |
---|
Abstract | We present Tightly Packed Tries (TPTs), a compact implementation of read-only, compressed trie structures with fast on-demand paging and short load times. We demonstrate the benefits of TPTs for storing n-gram back-off language models and phrase tables for statistical machine translation. Encoded as TPTs, these databases require less space than flat text file representations of the same data compressed with the gzip utility. At the same time, they can be mapped into memory quickly and be searched directly in time linear in the length of the key, without the need to decompress the entire file. The overhead for local decompression during search is marginal. |
---|
Publication date | 2009-06-05 |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
NRC number | NRCC 52533 |
---|
NPARC number | 16435915 |
---|
Record identifier | 9eb37696-ddab-4265-9f10-e5ff2f83779a |
---|
Record created | 2010-11-24 |
---|
Record modified | 2020-04-16 |
---|