Multiclass nonnegative matrix factorization for comprehensive feature pattern discovery

From National Research Council Canada

DOI	https://doi.org/10.1109/TNNLS.2018.2849932
Author	Li, Yifeng¹; Pan, Youlian¹; Liu, Ziying¹
Affiliation	National Research Council of Canada. Digital Technologies
Format	Text, Article
Subject	big data; cancer; feature pattern discovery; multiclass nonnegative matrix factorization (MC-NMF); stability selection
Abstract	In this big data era, interpretable machine learning models are strongly demanded for the comprehensive analytics of large-scale multiclass data. Characterizing all features from such data is a key but challenging step to understand the complexity. However, existing feature selection methods do not meet this need. In this paper, to address this problem, we propose a Bayesian multiclass nonnegative matrix factorization (MC-NMF) model with structured sparsity that is able to discover ubiquitous and class-specific features. Variational update rules were derived for efficient decomposition. In order to relieve the need for model selection and stably describe feature patterns, we further propose MC-NMF with stability selection, an ensemble method that collectively detects feature patterns from many runs of MC-NMF using different hyperparameter values and training subsets. We assessed our models on both simulated count data and multitumor RNA-seq data. The experiments revealed that our models were able to recover predefined feature patterns from the simulated data and identify biologically meaningful patterns from the pan-cancer data.
Publication date	2018-07-16
Publisher	IEEE
In	IEEE Transactions on Neural Networks and Learning Systems 30, no. 2: 615–629.
Language	English
Peer reviewed	Yes
Record identifier	86a509b0-de54-4406-a3eb-a381708d6877
Record created	2019-06-06
Record modified	2020-03-16

Date modified: 2024-12-26