Note: If you are using Excel to edit GenePattern files, be sure to save the file as a tab-delimited text file and supply the correct file extension. You can specify the file name in quotes to prevent Excel from appending .txt to the file name. Also, note that Excel's auto-formatting can introduce errors in gene names, as described in Zeeberg, et al (2004).
TXT File Format
The TXT format is a tab delimited file format that describes an expression dataset. It can be used with the GSEA module and is organized as follows:
The first line contains the labels Name and Description followed by the identifiers for each sample in the dataset. The Description is optional.
Name(tab)Description(tab)(sample 1 name)(tab)(sample 2 name) (tab) ... (sample N name)
Name Description DLBC1_1 DLBC2_1 ... DLBC58_0
The remainder of the file contains data for each of the genes. There is one line for each gene. Each line contains the gene name, gene description, and a value for each sample in the dataset. If the first line contains the Description label, include a description for each gene. If the first line does not contain the Description label, do not include descriptions for any gene. Gene names and descriptions can contain spaces since fields are separated by tabs.
(gene name) (tab) (gene description) (tab) (col 1 data) (tab) (col 2 data) (tab) ... (col N data)