GECCO Workshop on Medical Applications of Genetic and Evolutionary Computation, July 7-11, 2007.
Two medical data sets (breast cancer and colon cancer) are investigated within a visual data mining paradigm through the unsupervised construction of virtual reality spaces using genetic programming and, for comparison purposes, classical optimization. Constructing the desired visual spaces requires programs that represent vector functions, so a modified genetic programming approach is proposed. With this extension, populations are composed of forests instead of single expression trees. The approach is generic and does not require any particular kind of genetic programming algorithm. The results (visual spaces) show that the relationships between the data objects and their classes can be appreciated in all of the obtained spaces, regardless of the mapping error. In addition, the spaces obtained with genetic programming had lower mapping errors than those obtained with a classical optimizer, and the resulting equations were relatively simple. Further, the set of obtained equations can be statistically analyzed in terms of the original attributes in order to better understand how the new nonlinear features are derived. Thus, the explicit mappings provided by genetic programming can be used for feature selection and generation in data mining where scalar and/or vector functions are involved.
The Genetic and Evolutionary Computation Conference (GECCO-2007) (2007).
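The forest representation mentioned in the abstract can be illustrated with a minimal sketch, in which each individual is a list of expression trees and each tree computes one coordinate of the visual space. The tuple-based tree encoding, the function names (`evaluate`, `map_point`, `sammon_error`), and the choice of Sammon's stress as the mapping error are assumptions made for illustration only; the abstract names neither the error measure nor the function set used in the paper.

```python
import math
from itertools import combinations

# Hypothetical tree encoding (not taken from the paper): nested tuples.
#   ("x", i)                     -> i-th original attribute
#   ("const", c)                 -> numeric constant
#   ("add"|"sub"|"mul", l, r)    -> binary operation on two subtrees


def evaluate(tree, x):
    """Evaluate one expression tree on a data object x (a sequence of floats)."""
    op = tree[0]
    if op == "x":
        return x[tree[1]]
    if op == "const":
        return tree[1]
    left, right = evaluate(tree[1], x), evaluate(tree[2], x)
    if op == "add":
        return left + right
    if op == "sub":
        return left - right
    if op == "mul":
        return left * right
    raise ValueError(f"unknown operator: {op}")


def map_point(forest, x):
    """A forest is a vector function: one tree per output coordinate."""
    return tuple(evaluate(tree, x) for tree in forest)


def sammon_error(forest, data):
    """Sammon's stress between original and mapped pairwise distances.

    Assumption: used here as a stand-in for the paper's mapping error.
    """
    mapped = [map_point(forest, x) for x in data]
    weighted_sq_diff = 0.0
    total_dist = 0.0
    for i, j in combinations(range(len(data)), 2):
        d = math.dist(data[i], data[j])      # distance in the original space
        if d == 0.0:
            continue                         # skip coincident objects
        z = math.dist(mapped[i], mapped[j])  # distance in the visual space
        weighted_sq_diff += (d - z) ** 2 / d
        total_dist += d
    return weighted_sq_diff / total_dist


# Example individual: three trees, so it maps each object into a 3D space.
forest = [
    ("add", ("x", 0), ("x", 1)),
    ("mul", ("x", 2), ("const", 2.0)),
    ("sub", ("x", 0), ("x", 2)),
]
print(map_point(forest, (1.0, 2.0, 3.0)))
```

In a genetic programming run, a measure such as `sammon_error` would serve as the fitness of the whole forest, so the trees of an individual are evaluated jointly. Because each tree is an explicit equation in the original attributes, the evolved forest can be inspected directly.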