Note: If you are using Excel to edit GenePattern files, be sure to save the file as a tab-delimited text file and supply the correct file extension. You can enclose the file name in quotes to prevent Excel from appending .txt to it. Also note that Excel's auto-formatting can introduce errors in gene names, as described in Zeeberg et al. (2004).
RES File Format
The RES file format is a tab-delimited file format that describes an expression dataset. The main differences between the RES and GCT file formats are that the RES file format (1) contains labels for each gene's Absent/Marginal/Present (A/M/P) calls as generated by Affymetrix's GeneChip software and (2) does not allow missing expression values. The file is organized as follows:
The first line contains a list of labels identifying the samples associated with each of the columns in the remainder of the file. Two tabs (\t\t) separate the sample identifier labels because each sample contains two data values (an expression value and a present/marginal/absent call).
Description (tab) Accession (tab) (sample 1 name) (tab) (tab) (sample 2 name) (tab) (tab) ... (sample N name)
Description Accession DLBC1_1 DLBC2_1 ... DLBC58_0
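As a rough illustration of this layout, the following Python sketch extracts the sample names from the first line of a RES file. The function name parse_res_header is hypothetical and not part of GenePattern; the sketch assumes the line follows the template above exactly, so every sample name is followed by an empty field created by the doubled tab.

```python
def parse_res_header(line):
    """Return the sample names from the first line of a RES file.

    Hypothetical helper, not part of GenePattern. Assumes fields 0 and 1
    are the Description and Accession labels and that each sample name is
    followed by an empty field (the second tab of each pair).
    """
    fields = line.rstrip("\r\n").split("\t")
    # Sample names sit at indices 2, 4, 6, ...; the odd indices are empty.
    return fields[2::2]


# Header built with explicit tabs, using the sample names from the example.
header = "Description\tAccession\tDLBC1_1\t\tDLBC2_1\t\tDLBC58_0"
print(parse_res_header(header))
```

Slicing with a step of two skips the empty fields, so the result has one entry per sample regardless of whether the last name is followed by a trailing tab.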
The second line contains a list of sample descriptions. Currently, GenePattern ignores these descriptions.
(tab) (sample 1 description) (tab) (tab) (sample 2 description) (tab) (tab) ... (sample N description)
For example, our RES file creation tool places the sample data file name and scale factors in this row:
MG2000062219AA MG2000062256AA/scale factor=1.2172 ... MG2000062211AA/scale factor=1.1214
The third line contains a single number: the count of rows in the data table that makes up the remainder of the file. Note that the name and description columns are not included in the number of data columns.
(# of data rows)
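One way to sanity-check a file is to compare this declared count with the number of rows that actually follow it. The sketch below is a minimal illustration, not GenePattern's own validation; the function name check_res_row_count is hypothetical, and ignoring blank trailing lines is an assumption of the sketch rather than a documented rule.

```python
def check_res_row_count(lines):
    """Return (declared, actual) row counts for a RES file.

    Hypothetical helper. `lines` is the whole file as a list of strings.
    The declared count is read from the third line; the actual count is
    the number of non-blank lines after it (an assumption of this sketch).
    """
    declared = int(lines[2].strip())
    data_rows = [ln for ln in lines[3:] if ln.strip()]
    return declared, len(data_rows)


# Made-up two-sample, two-gene file used only for illustration.
example = [
    "Description\tAccession\tS1\t\tS2\n",
    "\tS1 description\t\tS2 description\n",
    "2\n",
    "first gene\tgene_a\t10.0\tP\t3.5\tA\n",
    "second gene\tgene_b\t7.2\tM\t8.1\tP\n",
]
declared, actual = check_res_row_count(example)
print(declared == actual)
```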
The rest of the data file contains data for each of the genes. There is one row for each gene and two columns for each of the samples. The first two fields in each row contain the gene's description and name (names and descriptions can contain spaces since fields are separated by tabs). The description field is optional, but the tab following it is not. Each sample has two pieces of data associated with it: an expression value and an associated Absent/Marginal/Present (A/M/P) call. The A/M/P calls are generated by microarray scanning software (such as Affymetrix's GeneChip software) and indicate the confidence in the measured expression value. Currently, GenePattern ignores the Absent/Marginal/Present call.
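The row layout described above can be sketched in Python as follows. The function name parse_res_row and the sample rows are made up for illustration and are not part of GenePattern. Because RES files do not allow missing expression values, the sketch lets float() raise an error on an empty value instead of substituting a default.

```python
def parse_res_row(line):
    """Split one RES data row into (description, name, samples).

    Hypothetical helper. `samples` is a list of (expression_value, call)
    tuples, one per sample, where call is "A", "M", or "P".
    """
    fields = line.rstrip("\r\n").split("\t")
    description, name = fields[0], fields[1]
    rest = fields[2:]
    # Walk the remaining fields two at a time: value, then its A/M/P call.
    samples = [(float(rest[i]), rest[i + 1]) for i in range(0, len(rest) - 1, 2)]
    return description, name, samples


# Made-up rows; the second has an empty description but keeps its tab.
print(parse_res_row("first gene\tgene_a\t10.0\tP\t3.5\tA"))
print(parse_res_row("\tgene_b\t12.5\tP"))
```

An empty description still yields the correct name because the leading tab produces an empty first field, which is why the tab after the description is mandatory.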