Collection Details

C2 collection details

Gene sets in this collection come from such sources as:

C2: CP collection details

The pathway gene sets are curated from the following online databases:

C4: CGN collection details

This collection is identical to that previously reported in (Subramanian, Tamayo et al. 2005).

Starting with a curated list of 380 cancer-associated genes (Brentani, Caballero et al. 2003, Proc. Natl. Acad. Sci. USA 100, 13418-13423), the authors (Subramanian, Tamayo et al. 2005) mined 4 expression compendia datasets for correlated gene sets. Gene neighborhoods with fewer than 25 genes at a Pearson correlation threshold of 0.8 were omitted, yielding 427 sets.

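The neighborhood construction described for C4: CGN can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single expression matrix given as a dictionary of gene name to expression profile, assumes each seed gene counts as a member of its own neighborhood, and uses the hypothetical function names `pearson` and `neighborhoods`. How the four compendia were combined is not stated above and is not modeled here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile has no defined correlation
    return cov / sqrt(var_x * var_y)


def neighborhoods(expr, seeds, threshold=0.8, min_size=25):
    """Return {seed: sorted member genes} for seeds with enough neighbors.

    expr      -- dict: gene name -> list of expression values (assumed layout)
    seeds     -- iterable of seed gene names (e.g. cancer-associated genes)
    threshold -- minimum Pearson correlation with the seed profile
    min_size  -- neighborhoods with fewer genes than this are omitted
    """
    result = {}
    for seed in seeds:
        if seed not in expr:
            continue  # seed not measured in this dataset
        members = [
            gene
            for gene, profile in expr.items()
            if pearson(expr[seed], profile) >= threshold
        ]
        if len(members) >= min_size:
            result[seed] = sorted(members)
    return result
```

With the defaults, `threshold=0.8` and `min_size=25` mirror the cutoffs quoted above; lowering `min_size` is convenient when trying the sketch on toy data.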
C5 collection details

Gene sets in this collection are derived from the controlled vocabulary of the Gene Ontology (GO) project: The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nature Genet. (2000) 25: 25-29 (www.geneontology.org). The gene sets are based on GO terms (gene_ontology_edit.obo, downloaded 1/25/2008) and their associations to human genes (gene2go, downloaded 1/22/2008).

Each GO term belongs to one of the three ontologies: molecular function (MF), cellular component (CC) or biological process (BP). A gene product might be associated with or located in one or more cellular components. It is active in one or more biological processes, during which it performs one or more molecular functions. Each ontology captures a unique aspect of the gene product.

A GO annotation consists of a GO term associated with a specific reference that describes the work or analysis upon which the association between a specific GO term and gene product is based. Each annotation must also include an evidence code to indicate how the annotation to a particular term is supported (www.geneontology.org/GO.evidence.shtml). Only associations with the following evidence codes are included in MSigDB gene sets: IDA, IPI, IMP, IGI, IEP, ISS, TAS.

GO gene sets for very broad categories, such as Biological Process, have been omitted from MSigDB. GO gene sets with fewer than 10 genes have also been omitted. Gene sets with the same members have been resolved based on the GO tree structure: if a parent term has only one child term and their gene sets have the same members, the child gene set is omitted; if the gene sets of sibling terms have the same members, the sibling gene sets are omitted.
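The filtering rules above can be sketched as follows. This is an illustration under stated assumptions, not the MSigDB pipeline: annotations are assumed to arrive as `(gene, go_term, evidence_code)` tuples, the GO tree is assumed to be a plain parent-to-children dictionary, and the function names `build_gene_sets` and `resolve_duplicates` are hypothetical. The sibling rule is read here as dropping every sibling that shares its members with another sibling.

```python
# Evidence codes listed above as accepted for MSigDB GO gene sets.
ALLOWED_EVIDENCE = {"IDA", "IPI", "IMP", "IGI", "IEP", "ISS", "TAS"}


def build_gene_sets(annotations, min_size=10, excluded_terms=frozenset()):
    """Group genes by GO term, keeping only accepted evidence codes.

    annotations    -- iterable of (gene, go_term, evidence_code) tuples (assumed layout)
    min_size       -- gene sets with fewer genes than this are omitted
    excluded_terms -- very broad terms to skip (e.g. the ontology roots)
    """
    sets = {}
    for gene, term, evidence in annotations:
        if evidence in ALLOWED_EVIDENCE and term not in excluded_terms:
            sets.setdefault(term, set()).add(gene)
    return {term: genes for term, genes in sets.items() if len(genes) >= min_size}


def resolve_duplicates(gene_sets, children):
    """Drop redundant gene sets using the parent/child structure.

    children -- dict: parent term -> list of child terms (assumed layout)
    """
    dropped = set()
    for parent, kids in children.items():
        present = [kid for kid in kids if kid in gene_sets]
        # Rule 1: a parent with exactly one child and identical members
        # keeps the parent set and omits the child set.
        if len(kids) == 1 and present and gene_sets.get(parent) == gene_sets[kids[0]]:
            dropped.add(kids[0])
        # Rule 2: siblings with identical members are omitted.
        for i, first in enumerate(present):
            for second in present[i + 1:]:
                if gene_sets[first] == gene_sets[second]:
                    dropped.update((first, second))
    return {term: genes for term, genes in gene_sets.items() if term not in dropped}
```

Running `build_gene_sets` first and `resolve_duplicates` second follows the order in which the rules are listed above; the actual order used to build the collection is not stated.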
MSigDB database v3.0 updated Sep 9, 2010
GSEA/MSigDB web site v3.82 released Oct 7, 2011
