Abstract | In differential gene expression data analysis, one objective is to identify groups of co-expressed genes in a large dataset in order to detect associations between such groups and experimental conditions. This is often done through a clustering approach, such as k-means or bipartition hierarchical clustering, which relies on a particular similarity measure in the grouping process. In such a dataset, differential expression itself is an innate attribute of each gene that can be used for feature extraction. For example, in a dataset consisting of multiple treatments versus their controls, the expression of a gene in each treatment shows one of three possible behaviors: upregulated, downregulated, or unchanged. In this chapter, we present a differential expression feature extraction (DEFE) method that denotes this behavior with a string in which each character takes one of three numerical values, i.e., 1 = up, 2 = down, and 0 = unchanged, which results in up to $3^B$ differential expression patterns across all $B$ comparisons. This approach has been successfully applied in many research projects; among these, we demonstrate the strength of DEFE in a case study on RNA-sequencing (RNA-seq) data analysis of wheat challenged with the phytopathogenic fungus Fusarium graminearum. Combinations of multiple schemes of DEFE patterns revealed groups of genes putatively associated with resistance or susceptibility to Fusarium head blight (FHB). |
---|
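
The encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`defe_code`, `defe_pattern`, `group_by_pattern`), the significance cutoffs (absolute log2 fold change of at least 1 and adjusted p-value below 0.05), and the toy gene values are all assumptions made for illustration. Only the mapping 1 = up, 2 = down, 0 = unchanged and the one-character-per-comparison string come from the text.

```python
from collections import defaultdict


def defe_code(log2fc, padj, lfc_cutoff=1.0, alpha=0.05):
    """Return '1' (up), '2' (down), or '0' (unchanged) for one comparison.

    The cutoffs are hypothetical defaults, not values given in the source.
    """
    if padj < alpha and log2fc >= lfc_cutoff:
        return "1"
    if padj < alpha and log2fc <= -lfc_cutoff:
        return "2"
    return "0"


def defe_pattern(comparisons, **cutoffs):
    """Concatenate one code per comparison into a DEFE pattern string."""
    return "".join(defe_code(lfc, p, **cutoffs) for lfc, p in comparisons)


def group_by_pattern(genes):
    """Group gene identifiers by their DEFE pattern."""
    groups = defaultdict(list)
    for gene, comparisons in genes.items():
        groups[defe_pattern(comparisons)].append(gene)
    return dict(groups)


# Toy data (made up): (log2 fold change, adjusted p-value) for B = 3 comparisons.
genes = {
    "geneA": [(2.3, 0.001), (1.8, 0.01), (0.2, 0.70)],
    "geneB": [(-1.5, 0.02), (0.1, 0.90), (-2.0, 0.001)],
    "geneC": [(2.1, 0.003), (1.2, 0.04), (0.4, 0.50)],
}

groups = group_by_pattern(genes)
max_patterns = 3 ** 3  # upper bound on distinct patterns when B = 3

print(groups)
print(max_patterns)
```

Genes that share a pattern string fall into the same group without any distance computation, which is what distinguishes this kind of feature extraction from similarity-based clustering.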