Pruning relations for substructure discovery of multi-relational databases

From National Research Council Canada

Download	Accepted manuscript (PDF, 367 KiB)
Author	Guo, H.; Viktor, H.L.; Paquet, Eric
Format	Text, Article
Conference	The 11th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), September 17-21, 2007, Warsaw, Poland
Abstract	Multi-relational data mining methods discover patterns across multiple interlinked tables (relations) in a relational database. In many large organizations, such a multi-relational database spans numerous departments and/or subdivisions, which are involved in different aspects of the enterprise such as customer profiling, fraud detection, inventory management, financial management, and so on. When considering multi-relational classification, it follows that these subdivisions will express different interests in the data, leading to the need to explore various subsets of relevant relations with high utility with respect to the target class. The paper presents a novel approach for pruning the uninteresting relations of a relational database where relations come from such different parties and span many classification tasks. We aim to create a pruned structure and thus minimize predictive performance loss on the final classification model. Our method identifies a set of strongly uncorrelated subgraphs to use for training and discards all others. The experiments performed demonstrate that our strategy is able to significantly reduce the size of the relational schema without sacrificing predictive accuracy.
Publication date	2007
In	The 11th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD) [Proceedings].
Language	English
NRC number	NRCC 49827
NPARC number	5763498
Record identifier	d741ee77-f3c0-4d76-a7d6-2d9757d8602a
Record created	2009-03-29
Record modified	2020-08-12

Date modified: 2024-12-22