Note: If you use Excel to edit GenePattern files, save the file as a tab-delimited text file and supply the correct file extension. To prevent Excel from appending .txt to the file name, enclose the file name in quotes when saving. Also note that Excel's auto-formatting can introduce errors in gene names, as described in Zeeberg et al. (2004).

CN File Format

This is a tab-delimited file format that contains SNP copy numbers. It contains one row for each SNP and one column for each SNP array; each cell holds the raw copy number value of that SNP on that array. It is organized as follows:

  1. The first line contains a list of labels identifying the SNP arrays.
    • Line format:
      SNP (tab) Chromosome (tab) PhysicalPosition (tab) (array_1_name) (tab) ... (array_N_name)
    • For example:
      SNP (tab) Chromosome (tab) PhysicalPosition (tab) MYNAH_p_Affy_plate_9_Mapping250K_Sty_A01_49068 (tab) ... MYNAH_p_Affy_plate_9_Mapping250K_Sty_A01_49084
  2. The rest of the file contains one row of data for each SNP.
    • Line format:
      (snp) (tab) (chromosome) (tab) (position) (tab) (array_1_cn) (tab) ... (array_N_cn)
    • For example:
      SNP_A-4249904 (tab) 17 (tab) 41420045 (tab) 2.265 (tab) ... 1.735
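
The layout above can be read with a few lines of Python. The following is a minimal sketch, not GenePattern code: the function name read_cn, the array names array_1 and array_2, and the second data row are hypothetical, and the first data row reuses the example values above with the elided columns left out.

```python
import csv
import io


def read_cn(handle):
    """Parse a CN file into (array_names, rows).

    Each row is a tuple (snp, chromosome, position, copy_numbers).
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    # The first three header labels are fixed by the format.
    if header[:3] != ["SNP", "Chromosome", "PhysicalPosition"]:
        raise ValueError("unexpected CN header: %r" % (header[:3],))
    array_names = header[3:]
    rows = []
    for fields in reader:
        if not fields:
            continue  # tolerate blank lines
        snp, chrom, pos = fields[:3]
        values = [float(v) for v in fields[3:]]
        if len(values) != len(array_names):
            raise ValueError("row %s has %d values, expected %d"
                             % (snp, len(values), len(array_names)))
        rows.append((snp, chrom, int(pos), values))
    return array_names, rows


# Illustrative input: array names and the second row are made up.
sample = (
    "SNP\tChromosome\tPhysicalPosition\tarray_1\tarray_2\n"
    "SNP_A-4249904\t17\t41420045\t2.265\t1.735\n"
    "SNP_hypothetical\tX\t1000\t2.0\t2.1\n"
)
array_names, rows = read_cn(io.StringIO(sample))
```

The chromosome is kept as a string because labels such as X are not numeric; the position is converted to an integer so that it can be compared numerically.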

Note: Sort the SNPs by chromosome and physical position (low to high). Most GenePattern modules, as well as many external tools, require sorted data.
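
One way to produce this ordering is sketched below. The helper names chromosome_key and sort_cn_rows and the sample rows are hypothetical. The sketch assumes that numbered chromosomes sort numerically and that non-numeric labels (for example X) follow them in alphabetical order; the format description does not state where non-numeric labels belong, so check the requirements of the module you plan to run.

```python
def chromosome_key(chrom):
    """Sort key: numbered chromosomes first (numerically), then other labels."""
    label = chrom.strip()
    if label.isdigit():
        return (0, int(label), "")
    # Assumption: non-numeric labels come after all numbered chromosomes.
    return (1, 0, label)


def sort_cn_rows(rows):
    """Return data rows sorted by chromosome, then physical position (low to high).

    Each row is a list of strings: [snp, chromosome, position, cn_1, ...].
    """
    return sorted(rows, key=lambda r: (chromosome_key(r[1]), int(r[2])))


# Made-up rows for illustration only.
rows = [
    ["snp_c", "X", "500", "2.0"],
    ["snp_b", "2", "300", "1.9"],
    ["snp_a", "10", "100", "2.1"],
    ["snp_d", "2", "50", "2.2"],
]
sorted_rows = sort_cn_rows(rows)
order = [r[0] for r in sorted_rows]
```

Sorting the chromosome column as plain text would place 10 before 2, which is why the key converts numeric labels to integers. The header line is not a data row and should be set aside before sorting.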

Sample CN file:

