Mutation Files

MAF (mutation annotation format) and MUT (mutation) files display mutations. IGV recognizes text-based files with .maf, .maf.txt, .mut, and .mut.txt file extensions as mutation files, but not binary files. IGV will display mutation files as independent tracks or overlaid on other data tracks, depending on your Mutations Preferences settings.

  • MAF files that originate from The Cancer Genome Atlas (TCGA) project follow fixed conventions that allow IGV to visualize and overlay the files directly. The format is detailed here.
  • All other MAF files require testing and may need modification for visualization or overlay. For overlay, if sample names do not follow TCGA conventions, they must match, or you must use a linking identifier.
  • The MUT file format is specific to IGV and has five fixed columns as described here. To overlay MUT file data on other tracks, sample names must follow TCGA conventions, match, or use a linking identifier.
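As a rough illustration of the five-column MUT layout, the following Python sketch builds a small tab-delimited MUT-style text. Only two facts come from this page: the format has five fixed columns, and the fourth column holds the sample name. The column names, the tab delimiter, the coordinates, and the sample names below are assumptions made for illustration, so check them against the MUT format description before relying on them.

```python
import csv
import io

# Assumed column names. This page only states that there are five fixed
# columns and that the fourth one refers to samples.
MUT_COLUMNS = ["chr", "start", "end", "sample", "type"]


def format_mut(rows):
    """Return tab-delimited MUT-style text for an iterable of 5-field rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(MUT_COLUMNS)
    for row in rows:
        if len(row) != len(MUT_COLUMNS):
            raise ValueError(
                f"expected {len(MUT_COLUMNS)} fields, got {len(row)}"
            )
        writer.writerow(row)
    return buf.getvalue()


# Made-up coordinates and sample names, for illustration only.
example = format_mut([
    ("chr17", 7517830, 7517831, "SampleA", "Missense_Mutation"),
    ("chr17", 7520036, 7520037, "SampleB", "Silent"),
])
print(example)
```

Saving the returned text to a file with a .mut extension would give IGV something it recognizes as a mutation file, assuming the column layout above matches the format description.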

There are two ways to open multiple MUT or MAF files at once in IGV:

  1. Select all the files to be visualized from your system's file manager and drag-drop into IGV.
  2. Alternatively, load a single MUT or MAF file that contains data for multiple samples. For instructions on merging multiple text files, see How to concatenate multiple text files. The link also outlines how to convert a MAF file to MUT format and how to display multiple tracks in collapsed format.
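One possible way to merge several per-sample files into a single file is sketched below in Python. It is not the procedure from the linked instructions. It assumes that every input file is tab-delimited text whose first line is an identical header row. The function name and file names are hypothetical, and files that start with comment lines would need extra handling.

```python
from pathlib import Path


def concatenate_mutation_files(paths, out_path):
    """Merge text files, writing the shared header row only once."""
    header = None
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        for path in paths:
            lines = Path(path).read_text(encoding="utf-8").splitlines()
            if not lines:
                continue  # skip empty files
            if header is None:
                # The first non-empty file supplies the header.
                header = lines[0]
                out.write(header + "\n")
            elif lines[0] != header:
                raise ValueError(f"{path}: header differs from the first file")
            for line in lines[1:]:
                if line.strip():  # drop blank lines
                    out.write(line + "\n")
    return out_path
```

For example, `concatenate_mutation_files(["sample1.mut", "sample2.mut"], "merged.mut")` would write one header followed by the data rows of both files, in the order given.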

IGV will visualize each individual sample's mutation data as a single track.

  • The all chromosomes view summarizes mutations in a coverage track (Screenshot below, 2015.02.18).
  • Zooming in, individual chromosome views and more detailed views mark sites of mutations with open rectangles. Default settings display these in black-and-white.
  • Color code mutations by mutation type (Screenshot above, 2015.02.18) by checking the Color code mutations box under View>Preferences>Mutations. See the Preferences page for more details.
  • Mouse-over or click on a mutation to bring up an information panel on the specific mutation. This panel displays the information provided in the mutation file columns, in order, up to an area limit.
  • A site where both alleles are mutated, or that is mutated in more than one sample in a track that combines multiple samples, displays the rectangle with a horizontal line through the middle.

 

Overlay mutation tracks on other data tracks

By default, IGV displays mutation file data in distinct tracks. To overlay, change the display options on the Mutations tab of the Preferences window. Do not use the right-click pop-up menu options Create Overlay Track or Separate Tracks.

  • IGV will overlay multiple mutation tracks for the same data file.
  • To overlay mutation data on other data tracks, the sample names must follow TCGA conventions or match. If your sample names are neither, see the next section about linking identifiers.
  • For TCGA data, whether in MAF or MUT format, merged or in multiple files, IGV matches on the patient identifier portion of the TCGA barcode, that is, the first 10 characters of the barcode excluding hyphens. IGV ignores the information that follows in the barcode, e.g. tissue origin as marked by the 11th and 12th positions of the barcode. TCGA barcode IDs are described here. The implications of how IGV handles these barcodes are discussed on the MAF file format page. To differentially overlay TCGA mutation data for sample tracks from the same patient, see descriptions of the Modified Attributes file in Sample Information Files.
  1. Load the mutation data and the other data on which it will be overlaid, e.g. a GCT expression file of RNA-Seq data.
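The matching rule for TCGA barcodes can be sketched in a few lines of Python. This is an illustration of the rule as described above, not IGV's own code. The function name is hypothetical and the two barcodes are illustrative values that differ only after the patient portion.

```python
def tcga_patient_key(barcode, length=10):
    """Return the first `length` characters of a barcode, ignoring hyphens."""
    return barcode.replace("-", "")[:length]


# Illustrative barcodes: same patient portion, different 11th-12th positions.
sample_a = "TCGA-02-0001-01C-01D-0182-01"
sample_b = "TCGA-02-0001-10A-01D-0182-01"

# Both barcodes reduce to the same key, so under this rule they would be
# treated as the same patient and not distinguished by tissue origin.
same_patient = tcga_patient_key(sample_a) == tcga_patient_key(sample_b)
print(tcga_patient_key(sample_a), same_patient)
```

Because everything after the tenth character is ignored, two sample tracks from one patient receive the same mutation overlay unless a linking identifier is used to tell them apart.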

  2. Go to View>Preferences>Mutations and check the box Overlay mutation tracks. Press OK. Tracks will overlay (Screenshot below, 2015.02.20).

    1. If the overlay box was checked before you loaded your two data types, the data may already be overlaid, may not be overlaid, or may be represented twice--once overlaid and also displayed as separate tracks. There are two actions to overlay data and remove duplicate independent mutation tracks:

      1. Uncheck the Overlay box, press OK, recheck the box and press OK. This overlays the tracks and removes duplicate separate tracks.

      2. You may need to quit and restart IGV to clear previously loaded data as starting a new session may not have cleared previous mutation data. This is a bug that will be fixed. The number of tracks indicated in the lower left corner of IGV should be consistent with what you are loading.

  3. To remove mutation tracks that do not have a corresponding partner track, uncheck the box Show orphaned mutation tracks.

    1. If the Show orphaned mutation tracks box was unchecked before loading your mutation data, and your preferences are set not to overlay mutations, mutation data may not display. Be sure this box is checked before loading data.

  4. To color code mutations by type, e.g. missense, silent, etc., check the box Color code mutations.

  1. To separate overlaid tracks, go to View>Preferences>Mutations and uncheck the box Overlay mutation tracks. Press OK to save preferences. Restart IGV and load data again.

Overlaid mutation tracks give an additional Sort by mutation count option for a selected region of interest.

Right-click on the ROI marked in red at top and select Sort by mutation count from the pop-up menu. For example, in the following Screenshot (2015.02.20) the overlaid tracks for the ERCC2 locus are reordered to display those with mutations at the top of the window.
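To make the idea of sorting by mutation count concrete, the Python sketch below counts the mutations per sample that fall inside a region and orders the samples so that those with the most mutations come first. It is a conceptual illustration only and does not reflect how IGV implements the feature. The function names, the five-field record layout, the coordinates, and the sample names are all made up.

```python
from collections import Counter


def mutation_counts_in_region(mutations, chrom, start, end):
    """Count mutations per sample that overlap [start, end] on chrom."""
    counts = Counter()
    for m_chrom, m_start, m_end, sample, _mut_type in mutations:
        if m_chrom == chrom and m_start <= end and m_end >= start:
            counts[sample] += 1
    return counts


def sort_samples_by_mutation_count(samples, counts):
    """Order samples by descending count; ties keep their original order."""
    return sorted(samples, key=lambda s: -counts.get(s, 0))


# Made-up records: (chromosome, start, end, sample, type).
mutations = [
    ("chr19", 100, 101, "S2", "Missense_Mutation"),
    ("chr19", 150, 151, "S2", "Silent"),
    ("chr19", 120, 121, "S3", "Silent"),
    ("chr1", 110, 111, "S1", "Silent"),
]
counts = mutation_counts_in_region(mutations, "chr19", 90, 200)
order = sort_samples_by_mutation_count(["S1", "S2", "S3"], counts)
print(order)
```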

 

Using linking identifiers to overlay tracks

To overlay mutations on data tracks with differing sample names, overlay tracks using one of the following two approaches. Each approach uses different Sample Information files but both use the configurable linking identifier which you must indicate in the Mutations Preferences panel, where the default is set to LINKING_ID.

  1. Load a Sample Mapping file that indicates matching sets, and specify the second column header as the linking identifier. For this particular approach, omission or inclusion of the #SampleMapping file header gives different results. Also, this approach allows loading a separate Attributes file.
    1. To keep mutation tracks separate, with the option to overlay on to other tracks using the linking identifier, omit use of the #SampleMapping file header.
    2. Inclusion of the file header #SampleMapping causes IGV to automatically overlay mutation tracks as the mapping file is loaded, regardless of the linking identifier.
    3. In either instance, checking or unchecking the mutation overlay box hides or shows independent mutation tracks.
      1. Although unchecking the mutation overlay box grays out the linking identifier, it is still active through following sessions.
      2. To inactivate and clear a linking identifier, type an unrelated text string in the identifier field, click OK to save to preferences, and restart IGV.
  2. Or load a Modified Attributes file containing a column that links identifiers. Specify the column header that is the linking identifier.

Illustration of the two use cases

Click on the links to download the given example files to your desktop.

Example files & description Example file preview

GCT format expression data & SEG format segmented copy number data.

MUT format mutation file with two mutations visible once zoomed to chromosome 17. The fourth column of a MUT file always refers to samples.

MAF formats are also accepted.

Sample information set 1:

Sample Mapping file and Attributes file. The mapping file omits the typical file header (#SampleMapping), and contains a linking column headed by LINKING_ID, which refers to labels in the fourth column of the MUT file.

Mapping files contain two columns with the linking identifiers in the second column.

Sample information set 2:

Modified Attributes file where the linking column header is Sample and refers to the labels in the fourth column of the MUT file.

Any column except the first may be used for the linking information. This allows use of an existing attribute column as the linking identifier column.
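Before loading either sample information set, it can help to confirm that every linking identifier actually appears in the fourth column of the MUT file. The Python sketch below does this for a two-column mapping file with the linking identifiers in the second column. The function names, the first-column header TRACK, and all sample values are hypothetical. The check assumes both files are tab-delimited with a single header row.

```python
import csv
import io


def read_column(text, column):
    """Return one column (by zero-based index) of tab-delimited text,
    skipping the header row and any rows that are too short."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return [row[column] for row in rows[1:] if len(row) > column]


def unlinked_ids(mapping_text, mut_text):
    """Return linking identifiers that match no sample in the MUT text."""
    linking_ids = set(read_column(mapping_text, 1))  # second column
    mut_samples = set(read_column(mut_text, 3))      # fourth column
    return sorted(linking_ids - mut_samples)


# Made-up file contents for illustration.
mapping_text = (
    "TRACK\tLINKING_ID\n"
    "expr_tumor_1\tMUT_1\n"
    "expr_tumor_2\tMUT_9\n"
)
mut_text = (
    "chr\tstart\tend\tsample\ttype\n"
    "chr17\t100\t101\tMUT_1\tSilent\n"
    "chr17\t200\t201\tMUT_2\tMissense_Mutation\n"
)
missing = unlinked_ids(mapping_text, mut_text)
print(missing)
```

An empty result would suggest that every linked track has mutation data to overlay. For a Modified Attributes file, the same check could be run with the index of whichever column serves as the linking identifier.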

 

Follow these steps to visualize example data as shown in the Screenshot below (2015.03.09):

  1. Open desktop IGV and load human hg18. Drag-drop downloaded GCT, SEG, and MUT data files into IGV window, then load either Sample information set 1 or 2.
    1. Quit and reopen IGV to clear any previous sample information and mutation data.
  2. Check the mutation overlay box under View>Preferences>Mutations. Indicate the linking identifier column and click OK. Mutations are now overlaid on linked data tracks.
    1. The default LINKING_ID is applicable for set 1.
    2. For set 2, define linking attribute column as Sample.
  3. Unchecking the mutation overlay box will display both the mutation-overlaid data tracks and the independent mutation tracks.
  4. To remove overlaid mutation tracks from data tracks, you must first uncheck the mutation overlay box, then restart IGV.