| Download | View accepted manuscript: A probabilistic model for fast and confident categorisation of textual documents (PDF, 325 KiB) |
|---|
| Author | Goutte, Cyril |
|---|
| Affiliation | National Research Council of Canada. NRC Institute for Information Technology |
|---|
| Format | Text, Book Chapter |
|---|
| Abstract | We describe the National Research Council's (NRC) entry in the Anomaly Detection/Text Mining competition organized at the Text Mining Workshop 2007. This entry relies on a straightforward implementation of a probabilistic categorizer described earlier [GGPC02]. This categorizer is adapted to handle multiple labeling and a piecewise-linear confidence estimation layer is added to provide an estimate of the labeling confidence. This technique achieves a score of 1.689 on the test data. This model has potentially useful features and extensions such as the use of a category-specific decision layer or the extraction of descriptive category keywords from the probabilistic profile. |
|---|
| Publication date | 2008 |
|---|
| Publisher | Springer |
|---|
| Place | Oxford |
|---|
| In | |
|---|
| Language | English |
|---|
| NRC number | NRCC 49829 |
|---|
| NPARC number | 5764844 |
|---|
| Record identifier | 05e3038a-f734-4b14-bcc4-d90f41df31e8 |
|---|
| Record created | 2009-03-29 |
|---|
| Record modified | 2024-02-05 |
|---|