Cancer Program Publication

A Strategy for Oligonucleotide Microarray Probe Reduction
Project: Bioinformatics & Computational Biology
Abstract 
Background:
One of the factors limiting the number of genes analyzable on high density oligonucleotide arrays is that each transcript is probed by multiple oligonucleotide probes of distinct sequence in order to magnify the sensitivity and specificity of detection. Over the years, the number of probes per gene has decreased, but still no single array for the entire human genome has been reported. To reduce the number of probes required for each gene, a robust systematic approach for choosing the most representative probes is needed. Here, we introduce a generalizable empiric method for reducing the number of probes per gene while maximizing the fidelity to the original array design.

Results:
The methodology has been tested on a dataset comprised of 317 Affymetrix HuGeneFL GeneChips. The performance of the original and reduced probe sets was compared in four cancer classification problems. The results of these comparisons demonstrate that the reduction of the probe set by 95% does not dramatically affect performance, and thus illustrate the feasibility of substantially reducing probe numbers without significantly compromising sensitivity and specificity of detection.

Conclusions:
The strategy described here is potentially useful for designing small, limited-probe genome-wide arrays for screening applications.

Authors: Alena A. Antipova, Pablo Tamayo, and Todd R. Golub
Publication Date: 11/25/2002
Contact email: golub@genome.wi.mit.edu
Publication URL: http://genomebiology.com/2002/3/12/research/0073
Citation: Genome Biology 2002, 3(12):research0073.1–0073.4
Keywords: probe pair; probe selection; Average Difference
 
Supplemental Information
Files
Description | File
Description of these files | AboutTheseFiles.doc
Paper in PDF format | Antipova_et_al_2002.pdf
Raw feature data for all the genes on the chips | RawFeatureData.tar.gz
Unscaled Delta(h), random Deltas, and Average Difference | UnscaledResFiles.tar.gz
Scaled Delta(h), random Deltas, and Average Difference | ScaledResFiles.tar.gz
Cls files, idealized expression vectors for class assignments | ClsFiles.tar.gz
Expanded Figure 2 | Fig2Features.xls
Expanded Table 1, includes classification parameters | Table1Features.xls
List of selected Delta(h) probes | ListOfDeltaHprobes.xls