Download | Accepted manuscript: Multi-objective Evolutionary Optimization for Visual Data Mining with Virtual Reality Spaces: Application to Alzheimer Gene Expressions (PDF, 423 KiB) |
---|
Author | Valdés, Julio; Barton, Alan |
---|
Format | Text, Article |
---|
Conference | Genetic and Evolutionary Computation Conference (GECCO) (a recombination of the 15th International Conference on Genetic Algorithms (ICGA) and the 11th Genetic Programming Conference (GP)), July 8-12, 2006, Seattle, Washington, USA |
---|
Subject | visual data mining; virtual reality spaces; multi-objective optimization; genetic algorithms; NSGA-II algorithm; k-nn classification; cross-validation error; similarity structure preservation; non-linear mapping; Sammon error; Alzheimer disease; genomics |
---|
Abstract | This paper introduces a multi-objective optimization approach to the problem of computing virtual reality spaces for the visual representation of relational structures (e.g. databases), symbolic knowledge and others, in the context of visual data mining and knowledge discovery. Procedures based on evolutionary computation are discussed. In particular, the NSGA-II algorithm is used as a framework for an instance of this methodology, simultaneously minimizing Sammon's error for dissimilarity measures and mean cross-validation error on a k-nn pattern classifier. The proposed approach is illustrated with an example from genomics (in particular, Alzheimer's disease) by constructing virtual reality spaces resulting from multi-objective optimization. Selected solutions along the Pareto front approximation are used as nonlinearly transformed features for new spaces that compromise between similarity structure preservation (from an unsupervised perspective) and class separability (from a supervised pattern recognition perspective), simultaneously. The possibility of spanning a range of solutions between these two important goals is a benefit for the knowledge discovery and data understanding process. The quality of the set of discovered solutions is superior to the ones obtained separately, from the point of view of visual data mining. |
---|
Publication date | 2006 |
---|
Language | English |
---|
NRC number | NRCC 48506 |
---|
NPARC number | 5765562 |
---|
Record identifier | b2353854-74ac-4e95-86cd-8daffbab4b91 |
---|
Record created | 2009-03-29 |
---|
Record modified | 2020-10-09 |
---|
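
The two objectives named in the abstract, Sammon's error and k-nn cross-validation error, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Euclidean distances in both the original and the mapped space, uses leave-one-out as the cross-validation scheme, and all function names (`pairwise_distances`, `sammon_error`, `knn_loo_error`, `objectives`) are hypothetical. A multi-objective optimizer such as NSGA-II would evaluate a pair of values like this for each candidate mapping.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X (an assumption;
    the paper speaks of dissimilarity measures in general)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_error(D_high, D_low, eps=1e-12):
    """Sammon's error between original (D_high) and mapped (D_low) distances.

    E = (1 / sum_{i<j} delta_ij) * sum_{i<j} (delta_ij - d_ij)^2 / delta_ij
    """
    iu = np.triu_indices_from(D_high, k=1)  # each unordered pair once
    d_high = D_high[iu]
    d_low = D_low[iu]
    weighted = (d_high - d_low) ** 2 / np.maximum(d_high, eps)
    return float(weighted.sum() / max(d_high.sum(), eps))


def knn_loo_error(Y, labels, k=1):
    """Leave-one-out error of a k-nn classifier in the mapped space Y."""
    D = pairwise_distances(Y)
    np.fill_diagonal(D, np.inf)  # a point may not vote for itself
    nearest = np.argsort(D, axis=1)[:, :k]
    errors = 0
    for i, idx in enumerate(nearest):
        values, counts = np.unique(labels[idx], return_counts=True)
        if values[np.argmax(counts)] != labels[i]:
            errors += 1
    return errors / len(labels)


def objectives(X, Y, labels, k=1):
    """Objective pair (structure preservation, class separability) for a
    candidate mapping of the original data X to low-dimensional points Y."""
    structure = sammon_error(pairwise_distances(X), pairwise_distances(Y))
    separability = knn_loo_error(Y, labels, k)
    return structure, separability
```

Both values are to be minimized. A mapping with a low first value preserves the similarity structure, and one with a low second value separates the classes well, which is the trade-off the Pareto front approximation spans.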