Sample Info (Attributes) file

Sample information files include Attributes files, Sample Mapping files, Attribute Colors files, and files that combine these types of information. These are tab-delimited text files with the extension .txt. IGV can load multiple sample information files per session.

When loaded into IGV, attributes display in a separate color-coded panel between sample names and tracks. See Sample Attributes and Sorting, Grouping, and Filtering for more information on displaying attributes and using attributes to manipulate tracks. IGV automatically assigns colors to attribute values and heatmaps to the data ranges it detects.

Sample information files allow integrating diverse data tracks from the same sample or patient.

  • Tracks can be grouped based on the value of an attribute from the sample information file, such as a patient identifier. See the example in the Attributes files section.
  • Similarly, sample information files can annotate VCF sample rows with metadata so that the rows can be grouped.
  • Sample information files can be used to overlay mutation tracks on other data tracks, e.g. expression or copy number data.

Overview of sample information file types

Consider your data visualization needs, because the different sample information file types enable different IGV features. The decision tree below matches use cases to the Sample Information file types.

Attribute, mapping, and color information may be in separate files, i.e. in Attributes files, Mapping files, and Color files, or in a single Sample Information file.

  • To save all three types of information in a single file, list attributes first, then mapping, and then color.
    • Between the information types, separate sections with row headers #sampleMapping and #colors.
    • Empty rows are not necessary and are ignored.
  • To differentially overlay mutation tracks while still assigning attributes across data types, use a Modified Attributes file.
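The single-file layout described above can be sketched in Python. This is a minimal illustration rather than an IGV tool: the function name `build_sample_info`, the column label `Sample`, and all sample names and attribute values are hypothetical. Only the section order and the `#sampleMapping` and `#colors` row headers come from the format description.

```python
def build_sample_info(attribute_rows, mapping_rows, color_rows):
    """Join three row lists into one tab-delimited Sample Information text.

    Each argument is a list of rows; each row is a list of strings.
    The first attribute row is the header row of attribute labels.
    Sections are written in the required order: attributes, mapping, colors.
    """
    lines = ["\t".join(row) for row in attribute_rows]
    lines.append("#sampleMapping")
    lines.extend("\t".join(row) for row in mapping_rows)
    lines.append("#colors")
    lines.extend("\t".join(row) for row in color_rows)
    return "\n".join(lines) + "\n"


# Hypothetical sample names and attribute values, for illustration only.
attributes = [
    ["Sample", "GENDER", "Subtype"],
    ["patient_01", "MALE", "Classical"],
    ["patient_02", "FEMALE", "Neural"],
]
mapping = [
    # data sample name first, Attributes file sample label second
    ["patient_01_cn", "patient_01"],
    ["patient_02_cn", "patient_02"],
]
colors = [
    ["GENDER", "MALE", "0,0,155"],
    ["*", "Classical", "80,180,80"],
]

combined_text = build_sample_info(attributes, mapping, colors)
print(combined_text)
```

Writing `combined_text` to a file with the .txt extension would give a single file that carries all three types of information.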

When loading attributes for datasets where sample names are identical across file types, no mapping information is necessary for the attributes to apply to the multiple data type tracks. However, to apply the same attribute information across datasets where sample names differ, you can use either of two different types of Sample Information sets as indicated by (b) and (c) in the table.

Multiple data track types? No.

  • Do attribute sample labels match data track sample names? No.
    • (a) Edit the Attributes file to include the matching data track sample names in the first column.
    • (b) Load Attributes and Sample Mapping information.
  • Do attribute sample labels match data track sample names? Yes.
    • Attributes apply to data tracks.

Multiple data track types? Yes.

  • Do attribute sample labels match data track sample names? No.
    • (b) Load Attributes and Sample Mapping information.
    • (c) Use a Modified Attributes file that integrates mapping information. This format allows differential overlay of mutation tracks.
  • Do attribute sample labels match data track sample names? Yes.
    • Attributes apply to data tracks.


Attributes file

An Attributes file lists track identifiers in the first column and attributes in subsequent columns with a single header row. IGV matches the track identifiers in a data file with the track identifiers in the Attributes file.

For example, load the second example file on top of IGV hg19's CopyNumber: [genome_wide_snp_6__broad]. These data are available from the hosted server data under The Cancer Genome Atlas>TCGA Broad GDAC>Firehose Standard Data>Broad Firehose Standard Data Run: 2015_02_04>BLCA-TP. Applying attributes to the data file allows sorting by copy number at the 22q13.32 locus and by the pathology.M.stage attribute, as shown in the Screenshot (2015.03.05) below.

Acceptable variations to the Attributes file

So long as the first row contains attribute labels and the first column contains sample names, the remaining rows may contain information pertaining to samples in any data type and may be organized in any way.

  • Because IGV loads multiple Attributes files per session, it is not necessary to merge attributes into a single file.
  • Attributes apply only to data tracks with matching names. Attribute rows without matching data tracks do not display, so an Attributes file may contain rows for samples that are not loaded.
  • For data tracks without a matching attribute row, corresponding IGV attributes panel rows remain blank.
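The name-matching behavior in the list above can be illustrated with a short Python sketch. It only mimics the described outcome (matched rows apply, unmatched attribute rows are dropped, unmatched tracks stay blank) and is not IGV's implementation. The helper names `load_attributes` and `attributes_for_tracks` and all sample data are hypothetical.

```python
import csv
import io


def load_attributes(text):
    """Parse tab-delimited Attributes text into {sample name: {label: value}}."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    labels = header[1:]  # first column holds sample names, the rest are attributes
    table = {}
    for row in reader:
        if not row or not row[0].strip():
            continue  # skip empty rows
        values = row[1:] + [""] * (len(labels) - len(row[1:]))  # pad short rows
        table[row[0]] = dict(zip(labels, values))
    return table


def attributes_for_tracks(track_names, table):
    """Return the attribute row for each track; unmatched tracks get an empty row."""
    return {name: table.get(name, {}) for name in track_names}


# Hypothetical data: patient_03 has attributes but no track,
# and patient_02 has a track but no attributes.
ATTRIBUTES_TEXT = (
    "Sample\tGENDER\tSubtype\n"
    "patient_01\tMALE\tClassical\n"
    "patient_03\tFEMALE\tNeural\n"
)
tracks = ["patient_01", "patient_02"]
panel = attributes_for_tracks(tracks, load_attributes(ATTRIBUTES_TEXT))
for name, row in panel.items():
    print(name, row)
```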

In the case of different data sets with different sample names from the same individual, e.g. copy number and RNA expression, you may wish to apply the information within a single attributes file in duplicate to the different data types. In this case, you may (b) additionally load a Sample Mapping file as outlined in the next section or (c) modify your Attributes file as outlined below.

For a single attributes file, duplicate the attributes by copy-pasting into empty rows, then modify sample names in the first column as needed for the differentially named datasets.

For multiple attributes files, duplicate the entire file and open each to modify sample names for the differentially named datasets as needed.
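The copy-and-rename step can also be scripted instead of done by hand. The following Python sketch assumes the attributes are already in memory as a list of rows. The function name `duplicate_attribute_rows`, the `aliases` dictionary, and the sample names are hypothetical.

```python
def duplicate_attribute_rows(rows, aliases):
    """Append renamed copies of attribute rows.

    rows    -- list of rows (lists of strings); rows[0] is the header row
    aliases -- {original sample name: [alternative sample names]}
    The original rows are kept, and one copy per alias is added at the end.
    """
    header, body = rows[0], rows[1:]
    copies = []
    for row in body:
        for alias in aliases.get(row[0], []):
            copies.append([alias] + row[1:])  # same attributes, new sample name
    return [header] + body + copies


# Hypothetical sample names for a copy number and an expression data set.
rows = [
    ["Sample", "GENDER", "Subtype"],
    ["patient_01", "MALE", "Classical"],
]
aliases = {"patient_01": ["patient_01_cn", "patient_01_expr"]}

expanded = duplicate_attribute_rows(rows, aliases)
for row in expanded:
    print("\t".join(row))
```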

The Modified Attributes file includes a column indicating a linking identifier for use in mutation overlay.

 

Sample Mapping file

A Sample Mapping file links differing sample names to resolve mismatches between the sample names in the Attributes file and those in the data files to which the attributes should apply. The file contains two tab-delimited columns.

Example 3 (download): BLCA_SampleMapping_3sets.txt

As with the Attributes file, information can be represented in a single or multiple Sample Mapping files.

  • Each file or section must start with row header #sampleMapping for IGV to recognize it as mapping information.
  • Empty rows and any other rows that start with # followed by text are ignored, with the exception of #colors. You can therefore use such rows to demarcate sections of the mapping information.
  • The first column contains data sample names. The second column contains Attributes file sample labels. Do not switch the order of the columns.
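The rules above can be sketched as a small Python parser. It is an illustration under stated assumptions, not IGV's own reader: the function name `parse_sample_mapping` is hypothetical, the header match is assumed to be exact and case-sensitive, and the sample names are invented.

```python
def parse_sample_mapping(text):
    """Return {data sample name: Attributes file sample label}.

    Rows are read only after a #sampleMapping header. Empty rows and other
    rows starting with # are skipped, and #colors ends the mapping section.
    """
    mapping = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            if line == "#sampleMapping":
                in_section = True
            elif line == "#colors":
                in_section = False
            continue  # every # row is a header or a comment, never data
        if in_section:
            fields = raw.split("\t")
            if len(fields) >= 2:
                # column 1: data sample name, column 2: attribute sample label
                mapping[fields[0].strip()] = fields[1].strip()
    return mapping


# Hypothetical mapping text with comment rows used to demarcate sections.
MAPPING_TEXT = (
    "#sampleMapping\n"
    "#copy number tracks\n"
    "patient_01_cn\tpatient_01\n"
    "\n"
    "#expression tracks\n"
    "patient_01_expr\tpatient_01\n"
    "#colors\n"
    "GENDER\tMALE\t0,0,155\n"
)
sample_mapping = parse_sample_mapping(MAPPING_TEXT)
print(sample_mapping)

# Resolve a track name to its attribute label, falling back to the name itself.
label = sample_mapping.get("patient_01_cn", "patient_01_cn")
```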

 

Attribute Colors

You can optionally specify colors for attribute values in RGB format. A color can be assigned to a specific attribute label or to a specific value. Numeric columns can be displayed as a heatmap scale, either in a single color or as a two-color heatmap over a specified range. Customize colors using either a separate Attribute Colors file or by adding a colors section to the end of any other Sample Information file. Colors information is tab-delimited with three or four columns as shown in the example below.

Column descriptions:

  • Column 1: attribute name.
  • Column 2: attribute value, or an attribute range with the bounds separated by a colon (:).
  • Column 3: color in RGB format. If used with column 4, this is the first color of a two-color heatmap.
  • Column 4 (optional): the second color in RGB format in a two-color heatmap for attribute ranges.

Example rows (columns are tab-separated in the file), each followed by its explanation:

  • #colors
  • GENDER MALE 0,0,155: a value of "MALE" for the "GENDER" column gets the color (0,0,155)
  • * Classical 80,180,80: a value of "Classical" in any column gets the color (80,180,80)
  • KarnScore * 0,0,255: numeric column example, monocolor heatmap
  • % Tumor Nuclei 90:100 0,0,255: another monocolor heatmap for the "% Tumor Nuclei" column, this time with the range specified
  • sil_width -0.1:0.5 0,0,255 255,0,0: a two-color heatmap with the range specified

Formatting rules:
  • Colors information, either file or section, must be headed by a row with #colors.
  • An asterisk (*) in either of the first two columns indicates a wildcard. 
  • RGB values are separated by commas (,) without spaces and may be listed inside double quotations, e.g. 0,0,155 or "0,0,155".
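To make the three- and four-column layout concrete, here is a minimal Python sketch that parses one colors row. It is an illustration only and not IGV's parser. The function names `parse_rgb` and `parse_color_row` and the returned dictionary keys are hypothetical, and treating a trailing empty field as an absent column 4 is an assumption.

```python
def parse_rgb(text):
    """Parse 'R,G,B', optionally wrapped in double quotes, into a tuple of ints."""
    parts = text.strip().strip('"').split(",")
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values, got {text!r}")
    rgb = tuple(int(part) for part in parts)
    if any(not 0 <= channel <= 255 for channel in rgb):
        raise ValueError(f"RGB values must be between 0 and 255: {text!r}")
    return rgb


def parse_color_row(line):
    """Split one tab-delimited colors row into its three or four columns."""
    fields = [field.strip() for field in line.split("\t")]
    while fields and fields[-1] == "":
        fields.pop()  # assumption: an empty trailing field means no column 4
    if len(fields) not in (3, 4):
        raise ValueError(f"expected 3 or 4 columns, got {len(fields)}")
    attribute, value = fields[0], fields[1]
    value_range = None
    if ":" in value:
        low, _, high = value.partition(":")
        try:
            value_range = (float(low), float(high))
        except ValueError:
            value_range = None  # not numeric, so treat it as a plain value
    return {
        "attribute": attribute,  # "*" is a wildcard for any column
        "value": value,          # "*" is a wildcard for any value
        "range": value_range,
        "color": parse_rgb(fields[2]),
        "second_color": parse_rgb(fields[3]) if len(fields) == 4 else None,
    }


two_color = parse_color_row("sil_width\t-0.1:0.5\t0,0,255\t255,0,0")
quoted = parse_color_row('GENDER\tMALE\t"0,0,155"\t')
print(two_color)
print(quoted)
```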

Look up RGB values by color wheel at https://color.adobe.com/create/color-wheel/. Alternatively look up RGB values on a chart at http://www.rapidtables.com/web/color/RGB_Color.htm.

Briefly, RGB (red, green, and blue light) is a system for representing colors for computer display. Each color is written as three comma-separated values, one per channel, where 0 represents the absence of that light and 255 its maximum intensity. Example RGB values are given below.

  • Red 255,0,0
  • Green 0,255,0
  • Blue 0,0,255
  • Yellow 255,255,0
  • Magenta 255,0,255
  • Cyan 0,255,255
  • Black 0,0,0
  • White 255,255,255
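As a worked illustration of how two RGB colors can span a numeric range in a two-color heatmap, the Python sketch below blends the channels linearly. Linear blending and clamping of out-of-range values are assumptions made for this example; how IGV itself computes heatmap colors is not described here. The function name `interpolate_color` is hypothetical.

```python
def interpolate_color(value, low, high, first, second):
    """Blend two RGB tuples according to where value falls between low and high.

    Values outside the range are clamped to the nearest end color.
    """
    if high == low:
        raise ValueError("range bounds must differ")
    fraction = (value - low) / (high - low)
    fraction = min(1.0, max(0.0, fraction))  # clamp to the ends of the range
    return tuple(
        round(start + (end - start) * fraction)
        for start, end in zip(first, second)
    )


blue, red = (0, 0, 255), (255, 0, 0)
# The range -0.1:0.5 and the two colors mirror the sil_width example row.
print(interpolate_color(-0.1, -0.1, 0.5, blue, red))
print(interpolate_color(0.5, -0.1, 0.5, blue, red))
print(interpolate_color(2.0, -0.1, 0.5, blue, red))
```

The low end of the range returns the first color and the high end returns the second, which matches the column 3 and column 4 roles in the colors table.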