Computing SNP Copy Number and Loss of Heterozygosity


In cancer genomics, copy number change is one of the hallmarks of the genetic instability common to most human cancers, and loss of heterozygosity (LOH) of tumor suppressor genes is a crucial step in the development of sporadic and hereditary cancer (Monti, 2005). Using modules available in GenePattern, you can compute SNP copy number and LOH from Affymetrix SNP chip data for paired target/normal samples and then view the results in the Integrative Genomics Viewer (IGV). The following modules are used for this computation, with IGV at the end for viewing the results:


SNPFileCreator converts the .CEL files from an Affymetrix array into a GenePattern .SNP file. Raw data for the probes in each SNP probe set are converted to a single intensity value per SNP using one of four modeling algorithms: Average Difference, PM/MM Difference Model (dChip, the default), Median Probe, or Trimmed Mean. Processing for this module can take 30 minutes or more, depending on the speed of the server, the size of the dataset, and available memory. At least 2 GB of memory is needed to run most SNPFileCreator jobs.
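As a rough illustration of what reducing a probe set to one value per SNP can look like, the sketch below implements a plain median and a plain trimmed mean over a list of probe intensities. It is not the module's implementation: the function names, the choice of which probe values go into the list, and the `trim_fraction` default are all assumptions made for this example. Average Difference and the PM/MM Difference Model are not sketched.

```python
from statistics import mean, median


def median_probe(intensities):
    """Summarize a probe set by the median of its probe intensities."""
    return median(intensities)


def trimmed_mean(intensities, trim_fraction=0.1):
    """Mean after dropping the lowest and highest trim_fraction of values.

    trim_fraction is a hypothetical default, not a documented module setting.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    values = sorted(intensities)
    k = int(len(values) * trim_fraction)  # number of values cut from each end
    kept = values[k:len(values) - k] if k else values
    return mean(kept)


# Hypothetical probe set with one outlier probe (5000.0).
probe_set = [100.0, 120.0, 110.0, 5000.0, 105.0, 95.0, 115.0, 90.0, 108.0, 112.0]
print(median_probe(probe_set))   # robust to the outlier
print(trimmed_mean(probe_set))   # outlier and lowest value are trimmed away
```

Both summaries are far less sensitive to a single aberrant probe than an untrimmed mean would be, which is the usual motivation for offering them.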

SNPFileCreator Inputs, Parameters, and Considerations

For more information about SNPFileCreator, please see the SNPFileCreator Documentation.


For gender-specific samples, run the XChromosomeCorrect module on the output of SNPFileCreator to correct intensity values for SNPs on the X chromosome. For each sample from a male donor, the module doubles the intensity value for SNPs on the X chromosome.
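The correction described above can be sketched in a few lines. This is a minimal illustration, not the module's code: the in-memory layout (a dict of per-sample intensity lists plus a parallel list of chromosome labels) and the function name are assumptions made for the example.

```python
def x_chromosome_correct(intensities, chromosomes, genders):
    """Double X-chromosome intensities for samples from male donors.

    intensities: dict mapping sample name -> list of per-SNP intensities
    chromosomes: list of chromosome labels, one per SNP (hypothetical layout)
    genders:     dict mapping sample name -> "M" or "F"
    """
    corrected = {}
    for sample, values in intensities.items():
        factor = 2.0 if genders[sample] == "M" else 1.0
        corrected[sample] = [
            value * factor if chrom == "X" else value
            for value, chrom in zip(values, chromosomes)
        ]
    return corrected


# Hypothetical data: one autosomal SNP followed by two X-chromosome SNPs.
chromosomes = ["1", "X", "X"]
intensities = {
    "tumor_m": [100.0, 50.0, 60.0],
    "tumor_f": [100.0, 100.0, 120.0],
}
genders = {"tumor_m": "M", "tumor_f": "F"}
corrected = x_chromosome_correct(intensities, chromosomes, genders)
print(corrected)
```

After correction, the male sample's X-chromosome values are on the same scale as the female sample's, so later copy number steps can treat both alike.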

XChromosomeCorrect Inputs, Parameters and Considerations

The sample information file describes the SNP array. It must be tab-delimited, include a column labeled Gender that contains a value of M or F for each sample, and include target/normal paired samples for copy number and LOH determination. (More information on file formats can be found here.)
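Before running the module, it can help to check that a sample information file meets the Gender requirement. The sketch below is one way to do that, under stated assumptions: only the Gender column and its M/F values come from the description above, while the `Sample` column name and the function name are hypothetical.

```python
import csv
import io


def read_genders(sample_info_text, sample_column="Sample"):
    """Parse tab-delimited sample info and return {sample: "M" or "F"}.

    sample_column is a hypothetical column name; only the Gender column
    is required by the description in the text.
    """
    reader = csv.DictReader(io.StringIO(sample_info_text), delimiter="\t")
    if reader.fieldnames is None or "Gender" not in reader.fieldnames:
        raise ValueError("sample information file needs a Gender column")
    genders = {}
    for row in reader:
        gender = (row["Gender"] or "").strip()
        if gender not in ("M", "F"):
            raise ValueError(f"Gender must be M or F, got {gender!r}")
        genders[row[sample_column]] = gender
    return genders


# Hypothetical two-sample file: a tumor and its paired normal.
example = "Sample\tGender\nT1\tM\nN1\tM\n"
print(read_genders(example))
```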

For more information about XChromosomeCorrect, please see the XChromosomeCorrect Documentation.


CopyNumberDivideByNormals computes the raw copy number of each target SNP by dividing its intensity value by the mean intensity value of all normal SNPs. This calculation is referred to as copy number normalization or normalization with respect to normals.
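The normalization can be sketched as a simple ratio. This is an illustration under assumptions, not the module's code: it takes the mean per SNP across the normal samples, and the data layout and function name are invented for the example. Whether the module rescales the ratio (for instance to a diploid baseline of 2) is not stated here, so the sketch returns the bare ratio.

```python
from statistics import mean


def divide_by_normals(target, normals):
    """Divide each target SNP intensity by the mean normal intensity.

    target:  list of per-SNP intensities for one target sample
    normals: list of per-SNP intensity lists, one list per normal sample
    Assumption: the mean is taken per SNP across the normal samples.
    """
    ratios = []
    for i, target_value in enumerate(target):
        normal_mean = mean(sample[i] for sample in normals)
        ratios.append(target_value / normal_mean)
    return ratios


# Hypothetical intensities for three SNPs.
target = [300.0, 100.0, 200.0]
normals = [
    [180.0, 210.0, 200.0],
    [220.0, 190.0, 200.0],
]
ratios = divide_by_normals(target, normals)
print(ratios)
```

A ratio above 1 suggests a gain relative to the normals, a ratio below 1 suggests a loss, and a ratio near 1 suggests no change.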

CopyNumberDivideByNormals Inputs, Parameters, and Considerations

For more information about CopyNumberDivideByNormals, please see the CopyNumberDivideByNormals Documentation.


The LOHPaired module detects loss of heterozygosity (LOH). It takes as input a GenePattern .SNP file that contains paired normal-target samples with genotype calls. (LOHPaired accepts only nonallele-specific .SNP files, that is, .SNP files that contain one intensity value per probe.) It returns as output a GenePattern .LOH file that contains, for each probe, the LOH calls for each array pair.

LOH call values are as follows.

Call    Value
L       LOH: AB in normal and A or B in tumor
R       Retention: AB in both normal and tumor or No Call in normal and AB in tumor
C       Conflict: A or B in normal and AB in tumor
N       Non-informative call: A or B in normal
        No call: No Call in normal or tumor
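The call rules above can be written as a small lookup function. This is a sketch of the rules as listed, not LOHPaired's implementation. The genotype strings (`"A"`, `"B"`, `"AB"`, `"No Call"`), the `"No call"` return label, and the function name are assumptions for the example. The rules do not say what happens when the normal is A or B and the tumor is No Call, so the sketch treats that case as a no call.

```python
NO_CALL = "No Call"          # hypothetical genotype label for a missing call
HOMOZYGOUS = {"A", "B"}
VALID = HOMOZYGOUS | {"AB", NO_CALL}


def loh_call(normal, tumor):
    """Return the LOH call for one SNP in one normal/tumor pair."""
    if normal not in VALID or tumor not in VALID:
        raise ValueError(f"unexpected genotype: {normal!r}, {tumor!r}")
    if normal == "AB":
        if tumor in HOMOZYGOUS:
            return "L"       # LOH: heterozygous normal, homozygous tumor
        if tumor == "AB":
            return "R"       # retention: heterozygous in both
        return "No call"     # tumor genotype is missing
    if normal == NO_CALL:
        # A heterozygous tumor counts as retention even without a normal call.
        return "R" if tumor == "AB" else "No call"
    # Remaining case: normal is homozygous (A or B).
    if tumor == "AB":
        return "C"           # conflict: tumor gained heterozygosity
    if tumor in HOMOZYGOUS:
        return "N"           # non-informative
    return "No call"         # assumption: missing tumor call gives no call


print(loh_call("AB", "A"), loh_call("AB", "AB"), loh_call("B", "AB"))
```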

LOHPaired Inputs, Parameters, and Considerations


The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types and provides easy access to genomes and datasets hosted by the Broad Institute.

Adding a track line to view LOH data

Specifier          Value                     Description
name               track label               Track name (ignored when used in the IGV file format)
description        center label              Currently ignored
visibility         full | dense | hide       Currently ignored
color              RRR,GGG,BBB               Color for positive values in all tracks
altColor           RRR,GGG,BBB               Color for negative values in all tracks
priority           N                         Currently ignored
autoScale          on | off                  Currently ignored; all tracks autoscale unless an explicit data range is defined (e.g., by including the viewLimits specifier).
gridDefault        on | off                  Currently ignored
maxHeightPixels    max:default:min           Default and min are supported; max is currently ignored
graphType          bar | points | heatmap    Scatter plot | heatmap. IGV only: The heatmap value is an IGV addition to the WIG specification.
midRange           x:y                       Defines the neutral range for a three-color heatmap. Values in this range are rendered with the midColor value, which is white by default. Example: midRange=20:80. IGV only: This specifier is an IGV addition to the WIG specification.
midColor           RRR,GGG,BBB               Color to use in the "mid range" of a heatmap. Example: midColor=0,0,150. IGV only: This specifier is an IGV addition to the WIG specification.
viewLimits         lower:upper               Defines the data range
yLineMark          real-value                Currently ignored
yLineOnOff         on | off                  Currently ignored
windowingFunction  maximum | minimum | mean  Function that summarizes the values in a window of data represented by one pixel
smoothingWindow    off | [MATKC:2-16]        Currently ignored
coords             0 | 1                     Indicates whether the file uses 0-based or 1-based coordinates. The UCSC specification for WIG files uses 1-based coordinates and for BED files uses 0-based coordinates. If data looks off by one, check for a possible 0-based vs. 1-based coordinate issue. IGV only: This specifier is an IGV addition to the WIG specification.
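To show how these specifiers combine into a track line, the sketch below joins key=value pairs into a single line. The `build_track_line` helper is hypothetical, and the `#track` prefix and the quoting of values that contain spaces are assumptions; check the IGV documentation for the exact form your file type expects. The specifier names and the midRange and midColor example values come from the table above.

```python
def build_track_line(specifiers, prefix="#track"):
    """Join specifier=value pairs into a single track line.

    prefix is an assumption: the exact leading token depends on the
    file format, so it is left configurable here.
    """
    parts = [prefix]
    for key, value in specifiers.items():
        text = str(value)
        if " " in text:
            text = f'"{text}"'   # quote values that contain spaces
        parts.append(f"{key}={text}")
    return " ".join(parts)


# Hypothetical heatmap settings for an LOH track.
track_line = build_track_line({
    "name": "LOH",
    "graphType": "heatmap",
    "midRange": "20:80",
    "midColor": "0,0,150",
})
print(track_line)
```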

Launching IGV and Viewing your data

To launch IGV and view your Copy Number and/or LOH data:

For more information on navigating or displaying data in IGV please see the IGV User Guide.


Updated on August 31, 2012 13:37