
Proc Natl Acad Sci U S A DOI:10.1073/pnas.0308531101

Metagenes and molecular pattern discovery using matrix factorization.

Publication Type: Journal Article
Year of Publication: 2004
Authors: Brunet, J-P, Tamayo, P, Golub, TR, Mesirov, JP
Journal: Proc Natl Acad Sci U S A
Date Published: 2004 Mar 23
Keywords: Algorithms; Central Nervous System Neoplasms; Computational Biology; Data Interpretation, Statistical; Leukemia; Medulloblastoma; Models, Genetic; Neoplasms

We describe here the use of nonnegative matrix factorization (NMF), an algorithm based on decomposition by parts that can reduce the dimension of expression data from thousands of genes to a handful of metagenes. Coupled with a model selection mechanism, adapted to work for any stochastic clustering algorithm, NMF is an efficient method for identification of distinct molecular patterns and provides a powerful method for class discovery. We demonstrate the ability of NMF to recover meaningful biological information from cancer-related microarray data. NMF appears to have advantages over other methods such as hierarchical clustering or self-organizing maps. We found it less sensitive to a priori selection of genes or initial conditions and able to detect alternative or context-dependent patterns of gene expression in complex biological systems. This ability, similar to semantic polysemy in text, provides a general method for robust molecular pattern discovery.
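As a rough illustration of the idea in the abstract, the sketch below factors a small nonnegative "expression" matrix V (genes x samples) into W (genes x k metagenes) and H (k metagenes x samples), then assigns each sample to the metagene with the largest coefficient. It is only a minimal sketch under stated assumptions: the function name `nmf`, the toy data, the iteration count, and the use of Euclidean-cost multiplicative updates are choices made here for illustration and are not taken from this record. The model selection mechanism mentioned in the abstract is not shown.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate V (genes x samples) as W @ H with nonnegative factors.

    Uses multiplicative updates for the squared Euclidean error
    (an assumption of this sketch, not necessarily the paper's cost).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    # Random positive initialization; results depend on the seed.
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # Each update rescales entries by a nonnegative ratio,
        # so W and H stay nonnegative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: 40 "genes", 12 "samples", two sample groups,
# each group expressing a different block of genes, plus small noise.
rng = np.random.default_rng(1)
V = np.zeros((40, 12))
V[:20, :6] = 1.0
V[20:, 6:] = 1.0
V += 0.05 * rng.random(V.shape)

W, H = nmf(V, k=2)

# Assign each sample to the metagene with the largest coefficient in H.
labels = H.argmax(axis=0)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("sample labels:", labels)
print("relative reconstruction error:", rel_error)
```

Here the columns of W play the role of metagenes and each column of H gives one sample's expression pattern over those metagenes. Because the initialization is random, different seeds can yield different factorizations, which is why a stochastic method of this kind is usually paired with some way of comparing runs.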


Alternate Journal: Proc. Natl. Acad. Sci. U.S.A.
PubMed ID: 15016911
PubMed Central ID: PMC384712