This title appears in the Scientific Report: 2019
Please use the identifier http://hdl.handle.net/2128/21854 in citations.
Rank Selection in Non-negative Matrix Factorization: systematic comparison and a new MAD metric
Personal Name(s): | Muzzarelli, Laura (Corresponding author); Weis, Susanne; Eickhoff, Simon; Patil, Kaustubh |
---|---|
Contributing Institute: | Gehirn & Verhalten; INM-7 |
Imprint: | 2019 |
Physical Description: | 7 |
Conference: | 2019 International Joint Conference on Neural Networks, Budapest (Hungary), 2019-07-14 - 2019-07-19 |
Document Type: | Contribution to a conference proceedings |
Research Program: | Human Brain Project Specific Grant Agreement 2; Supercomputing and Modelling for the Human Brain; Theory, modelling and simulation |
Abstract: Non-negative matrix factorization (NMF) is a powerful dimensionality reduction and factorization method that provides a parts-based representation of the data. In the absence of a priori knowledge about the latent dimensionality of the data, a rank for the reduced representation must be selected. Several rank selection methods have been proposed, but no consensus exists on when each method is suitable. In this work, we propose a new metric for rank selection based on imputation cross-validation, and we systematically compare it against six other metrics while assessing the effects of data properties. Using synthetic datasets with different properties, we show that most methods fail to identify the true rank and that properties of the data heavily affect how well each method performs. Imputation-based metrics, including our new MADimput, provided the best accuracy irrespective of the data type, but no solution worked perfectly in all circumstances. Users should therefore carefully assess the characteristics of their dataset in order to identify the most suitable metric for rank selection.

Keywords: non-negative matrix factorization, rank selection, cross-validation
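
To make the idea of imputation cross-validation for rank selection concrete, the following is a minimal sketch and not the authors' implementation. It hides a random fraction of matrix entries, fits NMF to the remaining entries only, and scores each candidate rank by the error on the hidden entries. Everything here is an assumption made for illustration: the function names `masked_nmf` and `imputation_errors`, the masked multiplicative-update solver, the 10% holdout fraction, and the score itself (median absolute error on held-out entries). The abstract does not define MADimput, so this score should be read as a stand-in rather than as that metric.

```python
import numpy as np


def masked_nmf(X, mask, rank, n_iter=200, seed=0, eps=1e-9):
    """Fit X ~ W @ H using only entries where mask is True.

    Uses weighted multiplicative updates for the squared error
    (an illustrative solver choice, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    M = mask.astype(float)
    MX = M * X  # held-out entries contribute nothing to the fit
    for _ in range(n_iter):
        H *= (W.T @ MX) / (W.T @ (M * (W @ H)) + eps)
        W *= (MX @ H.T) / ((M * (W @ H)) @ H.T + eps)
    return W, H


def imputation_errors(X, ranks, n_repeats=5, holdout=0.1, seed=0):
    """Return {rank: mean over repeats of the median absolute held-out error}."""
    rng = np.random.default_rng(seed)
    scores = {}
    for r in ranks:
        errs = []
        for rep in range(n_repeats):
            mask = rng.random(X.shape) > holdout  # True = observed entry
            W, H = masked_nmf(X, mask, r, seed=rep)
            resid = (X - W @ H)[~mask]  # error on hidden entries only
            errs.append(np.median(np.abs(resid)))
        scores[r] = float(np.mean(errs))
    return scores


# Hypothetical usage on synthetic non-negative data with true rank 3.
rng = np.random.default_rng(42)
X = rng.random((30, 3)) @ rng.random((3, 20))
scores = imputation_errors(X, ranks=range(1, 6))
best_rank = min(scores, key=scores.get)  # candidate with lowest held-out error
print(scores, best_rank)
```

Picking the rank with the lowest held-out error is the simplest selection rule. As the abstract notes, no metric worked perfectly in all circumstances, so a sketch like this is best treated as a starting point to be checked against the properties of the dataset at hand.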