Cancer Program Data Sets

Broad Institute Genome Data Analysis Center (GDAC)

On behalf of The Cancer Genome Atlas, the Broad Genome Data Analysis Center designs and operates scientific data and analysis pipelines which pump terabyte-scale genomic datasets through scores of quantitative algorithms, in the hope of accelerating the understanding of cancer.

An RNA interference model of RPS19 deficiency in Diamond Blackfan Anemia recapitulates defective hematopoiesis and rescue by dexamethasone: identification of dexamethasone responsive genes by microarray

 Commentary in BloodRPS19 paper commentary, Blood 2005.pdf
 Gene expression data: CEL
 Supplementary Figure S1Figure S1.pdf
 Supplementary Table S1TableS1.xls
 Supplementary Table S2TableS2.xls
 Supplementary Table S3TableS3.xls
 Supplementary Table S4TableS4.xls
 Supplementary Table S5TableS5.xls

Transformation from committed progenitor to leukaemia stem cell initiated by MLL-AF9

 Normal Progenitor And Leukemic SamplesNormals_Leu.gct
 MLL-AF9 Immediate SamplesMLL_AF9.gct
 HSC Signature compared to other normal progenitors.
 Metric: SNR; 1334 significant genes (FDR<=0.02)HSC_FDR002.gct
 Leukemic GMP Signature. Filtering: max-min=80; max/min=2.5
 Metric: SNR; 592 genes; p-value<=0.01; FDR<=0.023leuGMP.gct
 Self-renewal associated signature.
 Filtering: max-min=80; max/min=2.5Normals_Leu.threshold_FDR01.gct
 Metric: SNR; 420 significant probe sets; p-value<=0.001; FDR<=0.01
 Self-renewal associated signature - Down.
 Filtering: max-min=80; max/min=2.5300_HSCLeu_dn_0306.gct
 Metric: SNR; 302 significant probe sets; p-value<=0.001; FDR<=0.01
 Zip of CEL
 Sample KeySampleKey.xls
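
The signature files above are described by a variation filter (max-min=80; max/min=2.5) and an SNR metric. The sketch below shows one way such a filter and score could be computed. It is an illustration only, not the authors' pipeline. It assumes that the filter keeps a probe set when both thresholds are met or exceeded, that SNR means the common signal-to-noise definition (difference of class means divided by the sum of class standard deviations), and that sample standard deviations are used. All function names are hypothetical.

```python
from statistics import mean, stdev


def passes_variation_filter(values, min_diff=80.0, min_fold=2.5):
    """Return True if max-min >= min_diff and max/min >= min_fold.

    Assumption: rows with a non-positive minimum are rejected, because
    the fold change max/min is not meaningful for them.
    """
    lo, hi = min(values), max(values)
    if lo <= 0:
        return False
    return (hi - lo) >= min_diff and (hi / lo) >= min_fold


def snr(class_a, class_b):
    """Signal-to-noise ratio: (mean_a - mean_b) / (std_a + std_b).

    Assumption: sample standard deviations; each class needs >= 2 values.
    """
    denom = stdev(class_a) + stdev(class_b)
    if denom == 0:
        return 0.0
    return (mean(class_a) - mean(class_b)) / denom


def rank_by_snr(rows, labels):
    """Filter rows, score each by SNR, and sort by absolute score.

    rows   -- dict mapping a probe-set name to its expression values
    labels -- list of 0/1 class labels, one per sample column
    """
    scores = {}
    for name, values in rows.items():
        if not passes_variation_filter(values):
            continue
        a = [v for v, lab in zip(values, labels) if lab == 0]
        b = [v for v, lab in zip(values, labels) if lab == 1]
        scores[name] = snr(a, b)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


if __name__ == "__main__":
    # Made-up toy values, not taken from any of the listed data sets.
    toy_rows = {
        "g1": [100, 110, 300, 320],
        "g2": [100, 105, 110, 120],
        "g3": [50, 60, 200, 190],
    }
    for name, score in rank_by_snr(toy_rows, [0, 0, 1, 1]):
        print(name, round(score, 3))
```

Significance estimates such as the p-values and FDR thresholds quoted above would need a further step (for example, permuting the class labels), which this sketch leaves out.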

Loss-of-heterozygosity analysis of small-cell lung carcinomas using single-nucleotide polymorphism arrays.


Gene expression-based chemical genomics identifies rapamycin as a modulator of MCL-1 and glucocorticoid resistance in leukemia

 ALL Patient SamplesChildren_NE.gct
 Rapamycin treated CEM-C1 cell line(24 hour).Rap24hour_Control.gct
 Rapamycin treated CEM-C1 cell line(3 hour).Rap3hour_control.gct
 Resistant/Sensitive Signatures.
 Resistant Markers p-value <=0.0005Res_p0005.gct
 Sensitive Markers p-value <=0.0005Sens_p0005.gct
 Resistant Markers p-value <=0.001Res_p001.gct
 Sensitive Markers p-value <=0.001Sens_p001.gct
 Zip of all CEL

Lesional gene expression profiling in cutaneous T-cell lymphoma reveals natural clusters associated with disease outcome

 Samples' AnnotationSampleAnnotation.xls
 RMA, un-log2 transformed data ('.res' format)ctl.rma2.res.gz
 Raw '.CEL' files (1st 21 chips)
 Raw '.CEL' files (2nd 21 chips)
 Raw '.CEL' files (last 21 chips)

Expression-based Screening Identifies the Combination of Histone Deacetylase Inhibitors and Retinoids for Neuroblastoma Differentiation

 Plain text file describing available supplementary informationREADME.NB
 Neuroblastoma microarray dataNeuroCellLines_060628_ams.res
 CEL files for neuroblastoma

An erythroid differentiation signature predicts response to lenalidomide in Myelodysplastic Syndrome

 Datasets files and prediction program (R script)
 Sample annotation filejournal.pmed.0050035.st001.xls
 CEL filesrevlimid_files (1).zip

Sanger Cell Line Project

 Sanger Cell Line Project Affymetrix DataSanger_Cell_Line_Project_Affymetrix_QCed_Data_n798.gct
 Sanger Cell Line Project Affymetrix Data InformationSanger_affy_n798_sample_info_published.xls
 Instructions for obtaining CEL files for Affymetrix DataHow_to_obtain_CEL_files_for_SCLP_Affymetrix_data.doc
 Sanger Cell Line Project miR Raw
 Sanger Cell Line Project miR Normalized
 Sanger Cell Line Project miR Sample InfoSanger800_miRNA.stl

MicroRNA Dynamics in the Stages of Tumorigenesis Correlate with Hallmark Capabilities of Cancer

 Supplemental Table 1OlsonSupTable1.xls
 Supplemental Table 2OlsonSupTable2.xls
 Supplemental Table 3OlsonSupTable3.doc
 Supplemental Table 4OlsonSupTable4.xls

COT drives resistance to RAF inhibition through MAP kinase pathway reactivation

 DNA copy-number (segfile) of chromosome 10 in cell lines (CCLE)CCLE_Chrom10_MAP3K8_segfile.seg.txt
 DNA copy-number of MAP3K8/COT in cell lines (CCLE)CCLE_MAP3K8_copy-number.txt
 mRNA expression level of MAP3K8/COT in cell lines (CCLE)CCLE_MAP3K8_expressionAffy.txt

GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers


Nearest Template Prediction: A Single-Sample-Based Flexible Class Prediction with Confidence Assessment


Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1.

 B-scores data fileAchillesData19celllines.txt
 B-scores data file descriptionAchillesData19celllines File Description.doc

Integrative Transcriptome Analysis Reveals Common Molecular Subclasses of Human Hepatocellular Carcinoma

 HCC subclass meta-analysis signature genesHoshida_HCC_meta_analysis_3subclass_signature.txt

Identification of AML1-ETO Modulators by Chemical Genomics

 Text file describing supplemental information contentsREADME.AE
 Supplementary Tables and FiguresCorsello_SupplementaryTablesFigures.pdf
 Supplementary MethodsCorsello_SupplementaryMethods.pdf
 Kasumi AML1-ETO Knockdown DataKasumi_AML1-ETO_complete_200410.res
 U937 AML1-ETO Induced DataU937_AMLeto_inducible_ams.res
 Kasumi AML1-ETO Knockdown Data CEL
 U937 AML1-ETO Induced Data CEL

High-resolution mapping of copy-number alterations with massively parallel sequencing

 Alignment positions of sequence reads (hg18)arachne_qltout_marks.tar.gz
 Matlab files with alignable coordinateshg18_alignable_N36_D2.tar.gz
 Matlab source code, SegSeq version 1.0.1SegSeq_1.0.1.tar.gz

microRNA-mediated control of cell fate in megakaryocyte-erythrocyte progenitors

 Supplemental Data in pdfLu_Supplemental_Data_dev_cell.pdf
 Table S1Supplementary_Table1.xls
 Table S2Supplementary_Table2.xls
 Table S3Supplementary_Table3.xls
 Table S4, normalized expression datamegamiR_data.normalized.log2.th6.gct

Identification of RPS14 as a 5q- syndrome gene by RNA interference screen

 Sample information and data matrix (Excel)5q_shRNA_affy.xls
 GCT gene expression dataset5q_GCT_file.gct
 RES gene expression dataset5q_GCT_file.res
 CEL files

Assessing the significance of chromosomal aberrations in cancer: Methodology and application to glioma

 Supplemental InformationGISTIC_Supplement_071020.pdf
 Segmented Datasegmented_data_080520.seg
 Array List File for GISTICGlioma_array_list_080423.txt
 Copy-number Polymorphisms (100K SNP only)100K_CNVs_080423.txt
 Marker Positions100K_markerpositions.hg16.txt
 Sample information (txt format)Sample_info_070424.txt
 Array List File for GISTICPreprocessingGliomas_normals_array_list_080522.txt
 GISTICPreprocessing for 64-bit LinuxPREPROCESSING.tar.gz
 GISTIC for 64-bit LinuxGISTIC_0_9_2.tar.gz

Characterizing the cancer genome in lung adenocarcinoma

 Paper textWeir_Nature2007.doc
 Figure 1fig1.pdf
 Figure 2fig2.pdf
 Figure 3fig3.pdf
 Full supplementary informationWeir_supplement.pdf
 Sample information file - TSP samplesNewsampleinfo_TSP.xls
 Sample information file - non-TSP samplessampleinfo_non-TSP.xls
 Viewable file of GISTIC scores for high thresholdTSP_highthresh_scores.txt
 Viewable file of GISTIC scores for low thresholdTSP_lowthresh_scores.txt

Subclass Mapping: Identifying Common Subtypes in Independent Disease Data Sets

 Breast-A: data setBreast_A.gct
 Breast-A: class labelsBreast_A.cls
 Breast-B: data setBreast_B.gct
 Breast-B: class labelsBreast_B.cls
 Multi-A: data setMulti_A.gct
 Multi-A: class labelsMulti_A.cls
 Multi-B: data setMulti_B.gct
 Multi-B: class labelsMulti_B.cls
 DLBCL-A: data setDLBCL_A.gct
 DLBCL-A: class labelsDLBCL_A.cls
 DLBCL-B: data setDLBCL_B.gct
 DLBCL-B: class labelsDLBCL_B.cls
 DLBCL-C: data setDLBCL_C.gct
 DLBCL-C: class labelsDLBCL_C.cls
 DLBCL-D: data setDLBCL_D.gct
 DLBCL-D: class labelsDLBCL_D.cls
 HCclustid (generates .cls files from a clustering result)

Signature-Based Small Molecule Screening Identifies Cytosine Arabinoside as an EWS/FLI Modulator in Ewing Sarcoma

 Microsoft Excel sheet with supplementary tablesStegmaier_Supplementary_Tables.xls
 Data file with compound treated cell linesHTA_Ewings_CmpdTest_060510_ams.res
 Untreated control sample CEL
 ARA-C treated samples CEL
 Doxorubicin treated samples CEL
 Puromycin treated samples CEL
 Supplementary Figure 1Stegmaier_Supplementary_Figure1.pdf
 Supplementary Figure 2Stegmaier_Supplementary_Figure2.pdf
 Supplementary Figure 3Stegmaier_Supplementary_Figure3.pdf
 Supplementary Figure 4Stegmaier_Supplementary_Figure4.pdf
 Supplementary Table 1Stegmaier_Supplement.Table1.pdf
 Supplementary Table 2Stegmaier_Supplement.Table2.pdf
 Supplementary Table 3Stegmaier_Supplement.Table3.pdf
 Supplementary Table 4Stegmaier_Supplement.Table4.pdf
 Supplementary Table 5Stegmaier_Supplement.Table5.pdf

Metagene projection for cross platform, cross species characterization of global transcriptional states

 Readme file with instructions about how to run the codereadme.txt
 Leukemia 1 example: R code and
 Leukemia 2 example: R code and
 Lung example: R code and

Expression profiling of EWS/FLI identifies NKX2.2 as a critical target gene in Ewing's sarcoma

 Preview articleCancer Cell preview.pdf
 CEL file descriptionsSmith_et_al_CEL_file_list.xls
 CEL files for knockdown
 CEL files for inducible
 Supplemental appendix 1Smith, et al., supplemental appendix 1.doc
 Supplemental appendix 2Smith, et al., supplemental appendix 2.xls
 Supplemental appendix 3Smith, et al., supplemental appendix 3.xls

Identification of distinct molecular phenotypes in acute megakaryoblastic leukemia by gene expression profiling

 AMKL dataset, raw CEL
 AMKL dataset, sample information fileSample Information AMKL dataset.xls
 consensus clustering of non-DS AMKLConsensus_Clustering_nonDS_AMKL.xls
 marker selection by SAM: DS-TMD vs. DS-AMKLMarker_selection_SAM_DS_versus_TMD.xls
 marker selection by SAM: DS-AMKL vs. non-DS-AMKLMarker_selection_SAM_DS_vs_NDS.xls
 predictor of DS-AMKL vs. non-DS-AMKLPredictiionResults_d.xls
 PDF copy of preprintbourquin_pnas_2006.pdf

Allele-specific amplification in cancer revealed by SNP array analysis.


Gefitinib (Iressa) induces myeloid differentiation of acute myeloid leukemia

 Plain text file describing available supplementary informationREADME
 Microsoft Excel sheet with supplementary informationStegmaierSupplementalData050516.xls
 Gefitinib treated HL-60 cell line dataIressa_HL60_MeansScaling.res
 Gefitinib treated Kasumi cell line dataIressa_Kasumi_041201_ams.res
 Patient 1 (M3-AML) sample dataIressa_Patient1_ams.gct
 Patient 2 (M5-AML) sample dataIressa_Patient2_ams.res
 Patient 7 (M4-AML) sample dataIressa_Patient7_ams.res
 Primary patient AML cells sample dataMyeloid_Screen1_newData_021203_ams.AML_poly_mono.gct
 Gefitinib treated HL-60 cell line data CEL
 Gefitinib treated Kasumi cell line data CEL
 Patient 1 (M3-AML) sample data CEL
 Patient 2 (M5-AML) sample data CEL
 Patient 7 (M4-AML) sample data CEL
 Primary patient AML cells sample data CEL
 Manuscript Figure 1Stegmaier_Fig1_Final.pdf
 Manuscript Figure 2Stegmaier_Fig2_Final.pdf
 Manuscript Figure 3Stegmaier_Fig3_Final.pdf
 Manuscript Figure 4Stegmaier_Fig4_Final.pdf
 Manuscript Figure 5Stegmaier_Fig5_Final.pdf
 Manuscript Figure 6Stegmaier_Fig6_Final.pdf

A zebrafish bmyb mutation causes genome instability and increased cancer susceptibility

 MAGE formatted zebra fish crb mutant expression
 Whitehead gct formatted zebra fish crb mutant expression datasetcrash_and_burn.gct
 Class labels for the zebra fish expression datasetcrash_and_burn.cls
 Global Cancer Map (GCM) datasetGCM_All.gct
 Acute Lymphoblastic Leukemia (Golub et al)ALL_vs_AML_U95_test.res
 Adenocarcinoma with p53 mutation status (Beer et al)beer_lung_for_p53.gct
 Anaplastic Oligodendroglioma (Nutt et al)glioma_classic_hist.gct
 Classic Glioblastoma (Nutt et al)glioma_classic_hist.gct
 Glioblastoma Survival (Nutt et al)glioma_nutt_combo.gct
 Hepatic Carcinoma (Iizuka et al)hep_japan.gct
 Lung Adenocarcinoma Outcome (Beer et al)lung_annarbor_outcome_only.gct
 Lung Cancer Outcome (Bhattacharjee et al)lung_datasetB_outcome.gct
 Lymph Node Metastatic Gastric Adenocarcinoma (Chen et al)gastric_full_from_smad.paired_14.f1.5_g0.6.pcl
 Medulloblastoma (MacDonald et al)med_macdonald_from_childrens.gct
 Medulloblastoma (Pomeroy et al)medullo_datasetC_outcome.gct
 Metastatic Tumors (Ramaswamy et al)met.gct
 References to published datasets used in this analysisreferences_and_URLS_of_datasets.html
 Bmyb signature -- with expression levels (Table S1 full)bymb_signature_genes.htm
 Bmyb signature -- list of unique human genesbmyb_crb_plus_signature_genes.html
 Zebra fish p53 mutant and wild type CEL
 Zebra fish HU, Aph and control CEL
 Zebra fish 8g and MO treatment CEL
 Zebra fish bmyb crb, mo and control CEL
 Zebra fish cyclin WT and mutant CEL
 README for a description of zebrafish datasets usedREADME_FOR_INFO_ON_ZF_DATASETS.html

NFkB activity, function and target gene signatures in primary mediastinal large B-cell lymphoma and diffuse large B-cell lymphoma subtypes

 Supplementary InformationFF_NFKB_suppl_revised.pdf
 ASH '04 slidesASH04_friedrich.pdf
 Super-repressor expression data (RMA)super.rma.res.gz
 Super-repressor expression data (MAS5)super.mas5.res.gz

Integrative genomic analyses identify MITF as a lineage survival oncogene amplified in malignant melanoma.

 Supplementary Figure S1Garraway-s1.pdf
 Supplementary Figure S2Garraway-s2.pdf
 Supplementary Figure S3Garraway-s3.pdf
 Supplementary Figure S4Garraway-s4.pdf
 Supplementary Figure S5Garraway-s5.pdf
 Supplementary Figure LegendsGarraway-s6.doc
 Supplementary MethodsGarraway-s7.doc
 Supplementary Table S1Garraway-s8.doc
 Supplementary Table S2Garraway-s9.doc
 Supplementary Table S3Garraway-s10.doc
 Supplementary NotesGarraway-s11.doc

Homozygous deletions and chromosome amplifications in human lung carcinomas revealed by single nucleotide polymorphism array analysis.

 Supplemental dataZhao_2005_Suppl.pdf

MicroRNA Expression Profiles Classify Human Cancers

 Supplementary Table 1, probe informationsupplementary_table_1.xls
 Supplementary Table 2, sample informationsupplementary_table_2.xls
 Supplementary Table 3, N vs T prediction resultsupplementary_table_3.xls
 Suppl. Table 4, poorly differentiated tumor prediction resultsupplementary_table_4.xls
 microRNA data, miGCM_218 collectionmiGCM_218.gct
 microRNA data, acute lymphoblastic leukemiaALL.gct
 microRNA data, for samples with both miRNA and mRNA dataCommon_miRNA.gct
 microRNA data, mouse lung samplesmLung.gct
 microRNA data, poorly differentiated tumorsPDT_miRNA.gct
 microRNA data, HL-60 differentiationHL60.gct
 microRNA data, erythroid differentiationErythroid.gct
 mRNA data, for samples with both miRNA and mRNA
 mRNA data, poorly differentiated
 microRNA data, raw
 Expression data in MAGE-ML
 Raw data in MAGE-ML
 Frequently Asked QuestionsFAQ_miGCM.html
 Supplementary NotesSupplementary_Notes.pdf

Molecular profiling of diffuse large B-cell lymphoma reveals a novel disease subtype with brisk host inflammatory response and distinct genetic features

 Supplementary InformationDLBCL_supplement.pdf
 Raw data (mean scaled, see supplement)LF_ms_dlbcl_new_womedia.res.gz
 Raw data (unscaled)LF_dlbcl_new_womedia.res.gz
 Sample annotationsample_annotation.xls
 2118 genes (Top 5% by F statistic)genelist.50mean.fstat95.txt
 4246 genes (Top 10% by F statistic)genelist.50mean.fstat90.txt
 ASH meeting presentation (slides)ASH04.pdf
 ASH meeting presentation (handouts)ASH04handouts.pdf
 UNIGENE and GenBank annotation of AffymetrixAffy2UnigeneGenBank.xls
 UNIGENE and GenBank annotation of LymphochipLymphochip2UnigeneGenBank.xls
 Lymphochip 2 Affymetrix mappingLymphochip2Affy.xls
 Consensus Clusters' MarkersLF_ms_dlbcl_new_womedia_fstat95.DB2.unique.xls

An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis

 All datasets in a single zipped fileDATASETS.RAR
 All figures and tables in a single zipped filefiguresandtables.rar
 All gene sets in a single zipped fileGENESETS.RAR
 Supplementary methods and description (ms doc)4679_3_supp_0_1100271809.doc

Genomic Approaches to Hematologic Malignancies


Molecular characterization of the tumor microenvironment in breast cancer.


A Transcriptional Profiling Study of CAAT/Enhancer Binding Protein Targets Identifies Hepatocyte Nuclear Factor 3beta as a Novel Tumor Suppressor in Lung Cancer

 Raw dataCAN_6-15-04_Halmos.xls

Genome coverage and sequence fidelity of phi29 polymerase-based multiple strand displacement whole genome amplification.


An integrated view of copy number and allelic alterations in the cancer genome using single nucleotide polymorphism arrays.


The Six1 Homeoprotein Stimulates Tumorigenesis via Reactivation of Cyclin A1

 gene expression datadata.res

Errα and Gabpa/b specify PGC-1α-dependent oxidative phosphorylation gene expression that is altered in diabetic muscle

 PGC-1-alpha timecourse (scaled expression data)Scaled_Expression_Data.xls
 Mouse promotersrefGene_mm3_PERI_TSS_1000
 Mouse:human masked promotersmm3_1k_1k_70_10_mouse
 Gene list for Figure 1 correlogram5034_Genes_Correlogram_Figure_1.xls

High-resolution single-nucleotide polymorphism array and clustering analysis of loss of heterozygosity in human lung cancer cell lines.

 Supplemental Figure 1Janne_s1.jpg
 Supplemental TablesJanne_supp_tables.doc

Metagenes and molecular pattern discovery using matrix factorization

 ALL-AML gene expression dataALL_AML_data.txt
 ALL-AML samplesALL_AML_samples.txt
 ALL-AML genesALL_AML_genes.txt
 Medulloblastomas gene expression dataMedulloblastoma_data.txt
 Medulloblastomas samplesMedulloblastomas_samples.txt
 Medulloblastomas genesMedulloblastoma_genes.txt
 Matlab M-file for NMFnmf.m
 Matlab M-file for reordering NMF consensus matricesnmforderconsensus.m
 supplemental informationNMF_final_supplement.pdf
 Matlab M-file for NMF (model selection)nmfconsensus.m
 Some papers making use of NMF codes (as of 8/07)NMF_code_used_8_07.doc
 NMF codes FAQNMF_codes_FAQ.doc

GeneCluster 2.0: An advanced toolset for bioarray analysis


dChipSNP: significance curve and clustering of SNP-array-based loss-of-heterozygosity data.


Gene Expression-Based High Throughput Screening (GE-HTS) and Application to Leukemia Differentiation

 Text file describing supplemental information contentsREADME
 Manuscript Figure 1Fig1_edited.pdf
 Manuscript Figure 2Fig2_edited.pdf
 Manuscript Figure 3Fig3_edited.pdf
 Manuscript Figure 4Fig4_edited.pdf
 Manuscript Figure 5Fig5_edited.pdf
 Supplemental Information DocumentStegmaier_SupplementaryMethodsFiguresTables.pdf
 Supplemental Information Excel WorksheetsStegmaier_SupplementaryData.xls
 Initial Myeloid Primary Cell DataMyeloid_primarycells.res
 Initial HL-60 Cell DataHL60_undiff_PMA_ATRA.res
 Compound Treated HL-60 Cell DataMyeloid_Screen_Compound_Eval.res
 Primary Patient APL Cell DataMyeloid_APL_compound_eval.res
 Initial Myeloid Primary Cell Data CEL FilesMyeloid_primarycells_CELfiles.tar.gz
 Initial HL-60 Cell Data CEL FilesHL60_undiff_PMA_ATRA_CELfiles.tar.gz
 Compound Treated HL-60 Cell Data CEL FilesMyeloid_Screen_Compound_Eval_CELfiles.tar.gz
 Primary Patient APL Cell Data CEL FilesMyeloid_APL_compound_eval_CELfiles.tar.gz
 Nature Genetics GE-HTS News and ViewsGE-HTS_NandV.pdf

Loss of heterozygosity and its correlation with expression profiles in subclasses of invasive breast cancers.


Microarray Data Mining: Facing the Challenges


Integrated Analysis of Protein Composition, Tissue Diversity, and Gene Regulation in Mouse Mitochondria

 Supplemental Table S1Supplemental Table S1.xls
 Supplemental Table S2Supplemental Table S2.xls
 Supplemental Table S3Supplemental Table S3.xls
 Supplemental Table S4Supplemental Table S4.xls

The molecular signature of mediastinal large B-cell lymphoma differs from that of other diffuse large B-cell lymphomas and shares features with classical Hodgkin lymphoma

 Raw (unnormalized) data ('.res' format) gzip compressedMediastinal.res.gz
 Class membership (DLBCL or MLBCL?)Mediastinal.txt
 Supplementary InformationSupplementaryInfo.pdf
 CEL files (A chip)
 CEL files (B chip)
 Chip-to-sample mappingscan2sample.txt

Genome-wide loss of heterozygosity analysis from laser capture microdissected prostate cancer using single nucleotide polymorphic allele (SNP) arrays and a novel bioinformatics platform dChipSNP.

 Supplementary Figure 1Lieberfarb_s1.pdf
 Supplementary Table 1Lieberfarb_s2.pdf
 LOH DataLieberfarbSummary LOH data.txt

A Mechanism of Cyclin D1 Action Encoded in the Patterns of Gene Expression in Human Cancer

 Supplementary Information 1 (cyclin D1 nearest neighbors)SI1.txt
 Supplementary Information 2 (GCM KSS)SI2.txt
 Supplementary Information 3 (prostate KSS)SI3.txt
 Supplementary Information 4 (lung KSS)SI4.txt
 Supplementary Information 5 (brain KSS)SI5.txt
 Supplementary Information 6 (NCI60 KSS)SI6.txt
 Supplementary Information 7 (chromosome coordinates)SI7.pdf
 Supplementary Information 8 (competitor oligonucleotides)SI8.pdf
 Supplementary Information 9 (additional panels for Figure 6D)SI9.pdf
 Supplementary Information 10 (additional panels for Figure 6E)SI10.pdf
 raw gene expression dataraw_data.res

PGC-1α Responsive Genes Involved in Oxidative Phosphorylation are Coordinately Downregulated in Human Diabetes

 Human diabetes expression datareannotate_select_cal.eis.gz
 Phenotype dataPhenotype_Data.xls
 Probe sets corresponding to gene setsall_pathways.tar.gz
 GSEA results for NGT versus DM2GSEA_results_NGT_vs_DM2.xls
 OXPHOS homolog expression in mousemouse_expression.tar.gz

DNA Microarrays in Cancer: Realising the Promise of Personalised Medicine


Gene expression-based classification of malignant gliomas correlates better with survival than histological classification

 FiguresFigures 1, 2 and 3.ppt
 SupplementSupplementary Information - Cancer Research.doc
 Classics Res FileBrain_Classics.res
 Classics Class FileBrain_Classics.cls
 NonClassics Res FileBrain_NonClassics.res
 NonClassics Class FileBrain_NonClassics.cls
 CEL FilesGlioma CEL

Cancer Genomics and Molecular Pattern Recognition

 Paper (PDF)Humana_final_Ch_06_23_2002%20SR.pdf

Estimating Dataset Size Requirements for Classifying DNA Microarray Data

 Draft ManuscriptSample_size_fin.pdf

An Analytical Method For Multi-class Molecular Cancer Classification

 Paper (Word document)multiclass.siam_final_March_12_2003.pdf

Consensus Clustering: A resampling-based method for class discovery and visualization of gene expression microarray data

 Technical Reportconsensus4pdflatex.pdf
 Leukemia dataALB_ALT_AML.1000genes.res
 Leukemia class templateALB_ALT_AML.cls
 Novartis multi-tissue dataNovartis_BPLC.top1000.gct
 Novartis multi-tissue class templateNovartis_BPLC.cls
 St. Jude Leukemia dataleukemia.top1000.gct
 St. Jude Leukemia class templateleukemia.cls
 Lung cancer dataLungA_1000genes.gct
 Lung cancer class templateLungA_local.cls
 CNS tumors databrain_morpho.1000genes.res
 CNS tumors class templatebrain_morpho.cls
 Normal tissues datacGCM_9_15000_nml_90.top100.res
 Normal tissues class templatecGCM_9_15000_nml_90.cls
 Gaussian3 class templategaussian3.cls
 Gaussian4 class templategaussian4.cls
 Gaussian5 class templategaussian5.cls
 Simulated6 class templateartificial_dataset1.cls

Evidence for a Molecular Signature of Metastasis in Primary Solid Tumors

 Supplemental InformationMets_Supplement_Information_041110_KR.xls
 Dataset A - Global Cancer Map Tumor vs. Met.DatasetA_Tum_vs_Met.res
 Dataset B - Lung OutcomeDatasetB_Lung_outcome.res
 Dataset C - Rosetta Breast OutcomeDatasetC_Rosetta_breast_outcome.res
 Dataset D - Prostate OutcomeDatasetD_prostate_outcome.res
 Dataset E - Medulloblastoma OutcomeDatasetE_medulloblastoma_outcome.res
 Dataset F - LBC Lymphoma OutcomeDatasetF_lymphoma_outcome.res
 Table of ContentsTable_of_Contents.pdf

Identification of endoglin as a functional marker that defines long-term repopulating hematopoietic stem cells

 Supplementary informationendoglin.doc

A Strategy for Oligonucleotide Microarray Probe Reduction

 Description of these filesAboutTheseFiles.doc
 Paper in pdf formatAntipova_et_al_2002.pdf
 Raw feature data for all the genes on the chipsRawFeatureData.tar.gz
 Unscaled Delta(h), random Deltas, and Average DifferenceUnscaledResFiles.tar.gz
 Scaled Delta(h), random Deltas, and Average DifferenceScaledResFiles.tar.gz
 Cls files, idealized expression vectors for class assignmentsClsFiles.tar.gz
 Expanded Figure 2Fig2Features.xls
 Expanded Table 1, includes classification parametersTable1Features.xls
 List of selected Delta(h) probesListOfDeltaHprobes.xls

The Ewing's Sarcoma Oncoprotein EWS/FLI Induces a p53-Dependent Growth Arrest in Primary Human Fibroblasts

 Manuscript (PDF)CCELL.1_4_393.56.pdf
 Appendix 1: Overview of appendices and methods (MS Word)Appendix 1.doc
 Appendix 1: Overview of appendices and methods (PDF)Appendix 1.pdf
 Appendix 2: Expression data from the tet-EF cell expt (Excel)Appendix_2.xls
 Appendix 3: cls file for tet-EF cell knn analysis from fig. 2Appendix_3.cls
 Appendix 4a: cls file for Ewing's sarcomaAppendix_4a.cls
 Appendix 4b: cls file for Burkitt's lymphomaAppendix_4b.cls
 Appendix 4c: cls file for neuroblastomaAppendix_4c.cls
 Appendix 4d: cls file for rhabdomyosarcomaAppendix_4d.cls
 Appendix 5: limited tet-EF dataset (Excel) from fig. 2Appendix_5.xls
 Appendix 6: Limited SRCT dataset (Excel) from fig. 2Appendix_6.xls
 Appendix 7a: Ewing's-specific genes (Excel)Appendix_7a.xls
 Appendix 7b: Burkitt's-specific genes (Excel)Appendix_7b.xls
 Appendix 7c: neuroblastoma-specific genes (Excel)Appendix_7c.xls
 Appendix 7d: rhabdomyosarcoma-specific genes (Excel)Appendix_7d.xls
 Appendix 8: tet-EF cell upregulated genes (Excel)Appendix_8.xls
 Appendix 9: SOM clusters from tet-EF cells (Excel)Appendix_9.xls
 CEL files 1/2 (zip file, 15 MB)run1.tar.gz
 CEL files 2/2 (zip file, 14 MB)run2.tar.gz
 Sample descriptions for CEL filesrun_samples.txt

DNA Microarrays in Clinical Oncology

 Review ArticleJCO.pdf

Gene Expression Correlates of Clinical Prostate Cancer Behavior

 Prostate tumor and normal samplesProstate_TN_final0701_allmeanScale.res
 Prostate tumor samplesProstate_T_allmeansquare.res
 Prostate capsular penetration samplesProstate_CapPen_061901_allmeanscale.res
 Prostate surgical margin samplesProstate_Margin_allmeanscale.res
 Prostate outcome samplesProstate_nonrecur_vs_recur_scaled.res
 Document describing each of the downloadable files.Readme
 Prostate Normal Sample CEL files (N01 - N31) (65M)prostate_normal_N01-N31.CEL.tar.gz
 Prostate Normal Sample CEL Files (N32 - N62) (65M)prostate_normal_N32-N62.CEL.tar.gz
 Prostate Tumor Sample CEL files (T01 - T30) (66M)prostate_tumor_T01-T30.CEL.tar.gz
 Prostate Tumor Sample CEL files (T31 - T62) (66M)prostate_tumor_T31-T62.CEL.tar.gz
 Supplementary Information (pdf)SuppInfo_CCv3.pdf
 Figure 1, 2 and 3 from paper (pdf)Figure1_2and3.pdf
 Table 1 from paper (pdf).Table1v2.pdf
 Supplementary Figure 1 (pdf)Supplemental_Figure1.pdf
 Supplementary Figures 2 and 3 (pdf)Supplemental_Figure2and3.pdf
 Supplementary Figure 4 (pdf)Supplemental_Figure4.pdf
 Supplementary Figure 5 (pdf)Supplemental_Figure5.pdf
 Supplementary Figure Legends (pdf)SupplementalFigureLegendsRv1.pdf

Gene Expression-Based Classification and Outcome Prediction of Central Nervous System Embryonal Tumors

 Manuscript (PDF)Brain_Nature.pdf
 Supplementary Information document (MS Word)Pomeroy_et_al_0G04850_11142001_suppl_info.doc
 Figures (MS Power Point)Pomeroy_et_al_0G04850_11142001_figures.ppt
 Datasets and clinical table (ZIP, 13 Mbytes)
 Sample listCNS_samples_CEL_files_list.xls
 Raw data 1/4 (57 MB)scans_pt1.tar.gz
 Raw data 2/4 (57 MB)scans_pt2.tar.gz
 Raw data 3/4 (58 MB)scans_pt3.tar.gz
 Raw data 4/4 (58 MB)scans_pt4.tar.gz
 Dataset C (gene expression) RES file formatDataset_C_MD_outcome.res
 Dataset C (class labels) CLS file formatDataset_C_MD_outcome.cls
 Dataset C (clinical table) ASCII file formatBrain_4_MD_outcome_2.survival.tex

Diffuse Large B-Cell Lymphoma Outcome Prediction by Gene Expression Profiling and Supervised Machine Learning

 Supplemental Information (Microsoft Word)Shipp_et_al_Supplementary_Information_v5.doc
 Supplemental Information (pdf)Shipp_et_al_Supplementary_Information_v5.pdf
 DLBCL vs. FL morphology res filelymphoma_8_lbc_fscc2_rn.res
 DLBCL vs. FL morphology cls filelymphoma_8_lbc_fscc2.cls
 DLBCL outcome res filelymphoma_8_lbc_outcome_rn.res
 DLBCL outcome cls filelymphoma_8_lbc_outcome.cls
 Clinical Data Tablelymphoma_clinical_011127.xls
 Validation Marker Mapping - UniGene Mappinglymphoma_common_unigene.xls
 DLBCL CEL files (DLBC1 - DLBC29) (66M)Lymph_LBC_1-29.CEL.tar.gz
 DLBCL CEL files (DLBC30 - DLBC58) (66M)Lymph_LBC_30-58.CEL.tar.gz
 FSCC CEL files (FSCC1 - FSCC19) (43M)Lymph_FSCC_1-19.CEL.tar.gz
 Expanded Figure 5 from paperLymphoma_Shipp_et_al_Fig5.xls
 README describing downloadable filesReadme
 Paper (PDF)Shipp_et_al_2002.pdf

Multi-Class Cancer Diagnosis Using Tumor Gene Expression Signatures

 Manuscript (PDF)GCM.pdf
 Supplementary Information (PDF)PNAS_Supplementary_Information.pdf

MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia

 File infoMLL_supplemental_file_info.txt
 Scan ids, scaling factors and figure keyscaling_factors_and_fig_key.txt
 Expression data, scaled (Affymetrixexpression_data.txt
 Expression data as above, with Affymetrix A/P callsexpression_data_plus_APcalls.txt
 Raw data (CEL files) ALL scans part 1 (40MB)mll_scans_ALL1.tar.gz
 Raw data (CEL files) ALL scans part 2 (41MB)mll_scans_ALL2.tar.gz
 Raw data (CEL files) MLL scans part 1 (47MB)mll_scans_MLL1.tar.gz
 Raw data (CEL files) MLL scans part 2 (47MB)mll_scans_MLL2.tar.gz
 Raw data (CEL files) AML scans part 1 (33MB)mll_scans_AML1.tar.gz
 Raw data (CEL files) AML scans part 2 (34MB)mll_scans_AML2.tar.gz

Classification of Human Lung Carcinomas by mRNA Expression Profiling Reveals Distinct Adenocarcinoma Sub-classes

 Key to scan namesdatasetA_scans.txt
 Raw data (CEL files) ADENOS part 1 (~53MB)LUNG_scans_ADENO_part1.tar.gz
 Raw data (CEL files) ADENOS part 2 (~53MB)LUNG_scans_ADENO_part2.tar.gz
 Raw data (CEL files) ADENOS part 3 (~53MB)LUNG_scans_ADENO_part3.tar.gz
 Raw data (CEL files) ADENOS part 4 (~54MB)LUNG_scans_ADENO_part4.tar.gz
 Raw data (CEL files) ADENOS part 5 (~53MB)LUNG_scans_ADENO_part5.tar.gz
 Raw data (CEL files) ADENOS part 6 (~53MB)LUNG_scans_ADENO_part6.tar.gz
 Raw data (CEL files) ADENOS part 7 (~54MB)LUNG_scans_ADENO_part7.tar.gz
 Raw data (CEL files) ADENOS part 8 (~55MB)LUNG_scans_ADENO_part8.tar.gz
 Raw data (CEL files) ADENOS part 9 (~55MB)LUNG_scans_ADENO_part9.tar.gz
 Raw data (CEL files) ADENOS part 10 (~53MB)LUNG_scans_ADENO_part10.tar.gz
 Raw data (CEL files) Normal Lung (~48MB)LUNG_scans_NORM.tar.gz
 Raw data (CEL files) Small Cell (~17MB)LUNG_scans_SMC.tar.gz
 Raw data (CEL files) Squamous (~61MB)LUNG_scans_SQ.tar.gz
 Raw data (CEL files) Carcinoids (~56MB)LUNG_scans_COID.tar.gz
 DatasetA, all genes, rank-inv. scaled, averagedDatasetA_12600gene.txt.gz
 All scans, raw AFFY av.diff and A/P valsLung_DATASETA_scans_noscale.res.gz
 Variable genes used to cluster DatasetADatasetA_3312genesetdescription_sd50.txt
 DatasetB, all genes, rank inv. scaled, av'dDatasetB_12600gene_Fig2order.txt.gz
 DatasetB, 675 genesDatasetB_675gene.txt.gz

Chemosensitivity Prediction by Transcriptional Profiling

 Scaled expression data w/A_P callsNCI60_aug99_resfile.txt
 Drug sensitivity GI50 rawGI50_RAW.txt.gz
 List of scan namessamples_for_nci60paper.txt
 Raw CEL files, part 1 (20 scans, ~44MB)nci60_scans_part1.tar.gz
 Raw CEL files, part 2 (20 scans, ~44MB)nci60_scans_part2.tar.gz
 Raw CEL files, part 3 (20 scans, ~44MB)nci60_scans_part3.tar.gz
 Paper (PDF)Staunton_et_al_2001.pdf

Molecular Classification of Multiple Tumor Types

 Paper (PDF)Bioinformatics_200107.pdf

Genome-Wide Views of Cancer

 Paper (PDF)Golub_2001.pdf

Genomic analysis of metastasis reveals an essential role for RhoC

 Paper (PDF)Clark_et_al_2000.pdf
 Human A375 Table I (Excel)Human_data_set_A375_table_I.xls
 Human A375 Table II (Excel)Human_data_set_A375_table_II.xls
 Mouse B16 Table I (Excel)Mouse_data_set_B16_table_I.xls
 Human A375 Table I (Text)Human_data_set_A375_raw1.txt
 Human A375 Table II (Text)Human_data_set_A375_raw2.txt
 Mouse B16 Table I (Text)Mouse_data_set_B16_raw.txt

c-Myc is a critical target for c/EBPalpha in granulopoiesis.

 Paper (PDF)Johansen_et_al_2000.pdf

Class prediction and discovery using gene expression data

 Paper (PDF)Slonim_et_al_2000.pdf
 Paper (PS)

Expression analysis with oligonucleotide microarrays reveals that MYC regulates genes involved in growth, cell cycle, signaling, and adhesion

 Supplementary datasets and tablesdata_set_myc_genes.html
 Supplementary figuresfigures_myc_genes.html
 MYC genes dataset: ASCIIdata_set_myc_genes.txt
 MYC genes dataset: Exceldata_set_myc_genes.xls
 Paper (PDF)Coller_et_all_2000.pdf

GENOMICS: Journey to the Center of Biology

 Paper (PDF)LanderWeinberg.pdf

Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression

 Paper (PDF)Golub_et_al_1999.pdf
 Files descriptionFiles_descriptions.txt
 Experimental protocolprotocol.html
 Rescaling factorstable_ALL_AML_rfactors.txt
 Samples table (Word)table_ALL_AML_samples.rtf
 Samples table (text)table_ALL_AML_samples.txt
 Train dataset (Excel)data_set_ALL_AML_train.tsv
 Train dataset (text)data_set_ALL_AML_train.txt
 Test dataset (Excel)data_set_ALL_AML_independent.tsv
 Test dataset (text)data_set_ALL_AML_independent.txt
 Prediction results (Word)table_ALL_AML_predic.rtf
 Prediction results (text)table_ALL_AML_predic.txt
 Original and supplemental figures (Powerpoint)Figures_original_plus_suppl.ppt
 Train dataset in WI formatALL_vs_AML_train_set_38_sorted.res
 Train dataset class vector in WI formatALL_vs_AML_train_set_38_sorted.cls
 Test dataset in WI formatLeuk_ALL_AML.test.res
 Test dataset class vector in WI formatLeuk_ALL_AML.test.cls

Interpreting patterns of gene expression with self-organizing maps

 Experimental protocolprotocol.html
 Dataset descriptionDatasets_description.txt
 Dataset data_set_HL60 (text)data_set_HL60.txt
 Dataset data_set_HL60 (Excel)data_set_HL60.tsv
 Dataset data_set_HL60_U937_NB4_Jurkat (text)data_set_HL60_U937_NB4_Jurkat.txt
 Dataset data_set_HL60_U937_NB4_Jurkat (Excel)data_set_HL60_U937_NB4_Jurkat.tsv