Abstract | Military organizations have to deal with an increasing number of documents coming from different sources and in various formats (paper, fax, e-mail messages, electronic documents). These documents have to be screened, analyzed and interpreted in order to gain situation awareness, and categorized according to their content to enable efficient storage and retrieval. This information management process is currently partly manual, and intelligent techniques and tools are needed to support it. Integrating recent advances from several fields into a system that analyzes, diagnoses, filters, classifies and clusters documents with limited human intervention would improve the quality of information management while requiring fewer human resources. Better categorization and management of information would facilitate the correlation of information from different sources, avoid information redundancy, improve access to relevant information, and thus better support decision-making processes. The RDDC-Valcartier ADAC system (Automatic Documents Analyzer and Classifier) incorporates several techniques and tools for document summarization and for semantic analysis based on a domain ontology (e.g. terrorism), together with algorithms for diagnosis, classification and clustering. In this paper, we describe the architecture of the system and the techniques and tools used at each step of document processing. The first prototype implementation focuses on the terrorism domain, for which a document corpus and a related ontology are being developed. |
---|