This page introduces viewing alignment data and its components in IGV, section by section. Alignments are made against a reference sequence and serve different purposes, including:
In this guide, IGV feature examples are given for one datatype, but the concepts also apply to other datatypes. The order of the sections on this page roughly reflects the features that become visible as one zooms in the view. The sections cover:
Other pages cover related topics in more detail:
Changes to certain display parameters in the Alignment Preferences panel should be made ahead of loading data. Some of these preferences can be overridden on a per-track basis through pop-up menu options or by loading saved sessions.
Default parameters are tuned to viewing DNA alignments that typically cover the entire genome at low coverage depth and filter out marked duplicate reads. Adjust Alignment Preferences panel parameters for RNA-Seq data, PCR-free whole genome sequences, and other data that deviate from the breadth and depth of coverage of typical DNA alignments.
For example, before loading RNA-Seq data, increase the Visibility range threshold to 500. This does not affect IGV performance, because expression data typically covers ~5% of the genome and the deeper coverage is downsampled by default. In addition, check Show junction track to visualize splice junctions.
Both BAM and SAM files are described on the Samtools project page http://samtools.sourceforge.net/ and in the 2014 article titled Sequence Alignment/Map Format Specification by the SAM/BAM Format Specification Working Group.
Sort and Index
IGV requires that the alignment file, whether BAM or SAM, be sorted by coordinate and indexed. Indexing produces a secondary file with either a BAI or SAI extension, respectively. The resulting file can be associated with the alignment track by file naming convention, or loaded independently as a separate track with the index query parameter.
Igvtools does not process BAM files, because alternative resources such as Samtools have historically been available.
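As a rough sketch of the sort-and-index step using Samtools, the Python snippet below builds the `samtools sort` and `samtools index` command lines and runs them only when asked. It assumes `samtools` is installed and on the PATH. The file names, the `.sorted.bam` output suffix, and the helper names are hypothetical and chosen only for illustration.

```python
import subprocess
from pathlib import Path


def sort_and_index_commands(bam_path):
    """Return samtools command lines that coordinate-sort and then index a BAM file."""
    src = Path(bam_path)
    # Hypothetical output name: sample.bam -> sample.sorted.bam
    sorted_bam = src.with_name(src.stem + ".sorted.bam")
    return [
        # samtools sort orders by coordinate unless told otherwise
        ["samtools", "sort", "-o", str(sorted_bam), str(src)],
        # samtools index writes the index file next to the sorted BAM
        ["samtools", "index", str(sorted_bam)],
    ]


def sort_and_index(bam_path, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in sort_and_index_commands(bam_path):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    sort_and_index("sample.bam")  # dry run: prints the commands only
```

Keeping the command construction separate from execution makes it easy to inspect what would run before touching a large alignment file.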
This section covers the default coverage track and a second type of coverage track, the extended coverage track.
For both types, the coverage track represents coverage for all the reads, whereas the reads displayed in the alignment track may only represent a fraction of the reads. This partial representation is called downsampling and occurs for deep read coverage areas to improve IGV performance.
Default Coverage Track
IGV dynamically calculates and displays the default coverage track for an alignment file. When IGV is zoomed to the alignment read visibility threshold (by default, 30 kb), the coverage track displays the depth of the reads at each locus as a gray bar chart. If a nucleotide differs from the reference sequence in more than 20% of quality-weighted reads, IGV colors the bar in proportion to the read count of each base (A, C, G, T).
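The 20% rule can be illustrated with a small Python sketch. It assumes that "quality weighted" means each read contributes its phred base quality as a weight at the locus. This is an illustrative interpretation, not a description of IGV's internal code, and the function names are hypothetical.

```python
from collections import defaultdict

MISMATCH_THRESHOLD = 0.2  # the 20% cutoff described above


def quality_weighted_counts(pileup):
    """Sum base qualities per base at one locus.

    pileup is a list of (base, phred_quality) tuples, one per read.
    Assumption: the weight of a read is its base quality at this locus.
    """
    weights = defaultdict(float)
    for base, qual in pileup:
        weights[base.upper()] += qual
    return dict(weights)


def is_colored(pileup, ref_base, threshold=MISMATCH_THRESHOLD):
    """Return True when the quality-weighted mismatch fraction exceeds the threshold."""
    weights = quality_weighted_counts(pileup)
    total = sum(weights.values())
    if total == 0:
        return False
    mismatch = total - weights.get(ref_base.upper(), 0.0)
    return mismatch / total > threshold


# 3 of 10 reads mismatch with equal quality: 30% weighted mismatch, bar is colored
high_quality = [("A", 30)] * 7 + [("G", 30)] * 3
# same read counts, but the mismatches have low quality: about 5% weighted mismatch
low_quality = [("A", 40)] * 7 + [("G", 5)] * 3
print(is_colored(high_quality, "A"), is_colored(low_quality, "A"))
```

The second pileup shows why weighting matters: three low-quality mismatches among ten reads fall below the cutoff even though the raw mismatch count is 30%.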
When the alignment data is loaded with its matching extended coverage data, the coverage track displays data at all zoom levels, including the whole genome and chromosome views. To generate the extended coverage data file, which has the TDF extension, use igvtools. The resulting file can be associated with the alignment track by file naming convention or loaded independently as a separate track. TDF tracks loaded independently from an alignment do not display dynamically calculated features such as allele frequencies.
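A minimal sketch of generating the TDF file follows, written as a Python wrapper around the `igvtools count` command. It assumes `igvtools` is on the PATH and that the naming convention appends `.tdf` to the full alignment file name (for example, `sample.bam` becomes `sample.bam.tdf`). The genome identifier `hg19` and the file name are placeholders.

```python
import subprocess


def igvtools_count_command(alignment_path, genome):
    """Build an `igvtools count` command that writes a TDF next to the alignment file."""
    # Assumed naming convention: sample.bam -> sample.bam.tdf
    tdf_path = alignment_path + ".tdf"
    return ["igvtools", "count", alignment_path, tdf_path, genome]


def make_tdf(alignment_path, genome, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = igvtools_count_command(alignment_path, genome)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    make_tdf("sample.sorted.bam", "hg19")  # placeholder file and genome
```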
IGV reduces memory usage at two levels to improve performance. The first is the zoom threshold at which alignments become visible; the second is the downsampling of areas of deep read coverage. We present these two levels together in this section because their settings combine to affect IGV performance. Users should adjust the following default settings, which are tuned for DNA alignments at low coverage, in the Alignment Preferences panel to suit specific data types.
For example, for RNA-Seq alignments that cover extended regions at low depth, increase the visibility range threshold (for example, to 500) to view alignments at wider zoom levels.
Areas of downsampled reads are marked with a black rectangle just under the coverage track. The coverage track represents coverage for all the reads.
In the example shown, the downsampled regions are consecutive and marked by seven black rectangles just under the coverage track.
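To make the idea of downsampling concrete, here is a sketch that keeps at most a fixed number of reads per window using reservoir sampling and reports which windows were thinned (the windows that would get a black rectangle). Reservoir sampling is one standard way to do this and is used here as an illustration. The window size of 50 bases and the cap of 100 reads are assumed values, and all names are hypothetical.

```python
import random


def reservoir_sample(items, k, rng):
    """Return k items chosen uniformly at random from items (all of them if fewer than k)."""
    sample = []
    for i, item in enumerate(items):
        if i < k:
            sample.append(item)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


def downsample(read_starts, window_size=50, max_reads=100, seed=0):
    """Keep at most max_reads reads per window.

    read_starts: read start positions (0-based).
    Returns (kept_starts, downsampled_windows), where each window is a
    half-open (start, end) interval in which reads were dropped.
    """
    rng = random.Random(seed)
    windows = {}
    for start in read_starts:
        windows.setdefault(start // window_size, []).append(start)

    kept, marked = [], []
    for index in sorted(windows):
        reads = windows[index]
        if len(reads) > max_reads:
            marked.append((index * window_size, (index + 1) * window_size))
        kept.extend(sorted(reservoir_sample(reads, max_reads, rng)))
    return kept, marked


starts = [10] * 500 + [120] * 20      # one deep window, one shallow window
kept, marked = downsample(starts)
print(len(starts), len(kept), marked)  # coverage would still count all 520 reads
```

Only the displayed reads are thinned in this sketch; a coverage calculation would continue to use the full list of reads, consistent with the behavior described above.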
When an alignment track is loaded, two tracks are displayed: (1) a coverage track and (2) the alignment track. Display of the default splice junctions track requires enabling the setting in the Alignment Preferences panel. This section gives an overview of the alignment track. For options available from the alignment track menu, including grouping, sorting and coloring options, see the alignments section of the pop-up menu page.
Detecting Structural Variants
IGV uses color and other visual markers to highlight potential genetic alterations in reads against a reference sequence. Genetic alterations include single nucleotide variations, structural variations, and aneuploidy. Structural variations include insertions, deletions, inversions, tandem duplications, translocations, and other more complex rearrangements. Interpretation of some of these variations is discussed briefly in this section and the next. Interpreting Color by Insert Size and Interpreting Color by Pair Orientation give more detailed explanations of read colors.
An additional factor to take into consideration when judging potential genetic alterations is quality of reads and quality of mapping. IGV uses transparency to indicate quality.
Colors and transparency are used at two levels within alignments: (1) for mapped reads, and (2) for individual bases within reads.
Color and Transparency for Individual Bases
By default, read bases that match the reference are displayed in gray. Read bases that do not match are color coded, and insertions and deletions within reads relative to the reference are marked. Insertions are indicated by a purple I and deletions are indicated by a black dash (–). In addition, mismatched bases are assigned a transparency value proportional to the base call quality, known as the phred score. This has the effect of de-emphasizing low-quality base calls.
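The relationship between phred score and transparency can be sketched as follows. The phred definition (Q = -10 log10 of the error probability) is standard. The linear opacity ramp and its bounds (`q_min=5`, `q_max=20`, `alpha_min=0.1`) are assumptions made for illustration and are not taken from this page.

```python
def phred_to_error_probability(q):
    """Convert a phred quality score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def base_alpha(q, q_min=5, q_max=20, alpha_min=0.1):
    """Map a phred score to an opacity in [alpha_min, 1.0].

    Assumed scheme: bases at or below q_min are drawn at alpha_min,
    bases at or above q_max are fully opaque, with a linear ramp in between.
    """
    if q <= q_min:
        return alpha_min
    if q >= q_max:
        return 1.0
    return alpha_min + (1.0 - alpha_min) * (q - q_min) / (q_max - q_min)


for q in (2, 10, 20, 40):
    print(q, phred_to_error_probability(q), round(base_alpha(q), 2))
```

Under this scheme a mismatch with a low phred score is drawn faintly, so it attracts less attention than a mismatch supported by a confident base call.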
Transparency for Mapped Reads
Note that alignments displayed with light gray borders and transparent or white fill, as shown in the screenshot, have a mapping quality equal to zero. Interpretation of this mapping quality depends on the aligner, as some commonly used aligners use this convention to mark a read with multiple alignments. In such a case, the read also maps to another location with equally good placement. It is also possible that the read could not be uniquely placed, but that the other placements are not necessarily equally good.
In a gapped alignment, IGV indicates insertions with respect to the reference with a purple I, or with a red I for insertions larger than a cutoff that the user enables and specifies. Hover over the insertion symbol to view the inserted bases.
In a gapped read, IGV indicates deletions with respect to the reference with a black bar.
Coloring and Sorting Alignments
Users can also specify colors and sort reads by various options, including start location, strand, nucleotide, mapping quality, sample tag, or read group tag. For a description of all user-specified color and sort options, see the alignment track pop-up menu.
For example, to sort alignments:
Sorting rearranges rows so that alignments that intersect the center of the display appear in the order specified. This can cause the alignment layout away from the center line to appear sparse. To restore the layout to an optimally packed configuration, select Re-pack alignments from the pop-up menu.
Repeat the most recent sort with hotkey ctrl-s.
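The center-line behavior of sorting described above can be modeled with a short sketch. Each row is a list of alignments. Rows are ordered by the alignment that intersects the center position, and rows with no alignment at the center go to the bottom. This is a simplified model for illustration, with hypothetical names and half-open coordinates assumed. It is not IGV's implementation.

```python
def sort_rows_by_center(rows, center, key):
    """Order rows by the alignment that overlaps `center`.

    rows: list of rows, each a list of dicts with "start" and "end"
          (half-open interval, an assumption of this sketch).
    key:  function that extracts the sort value from an alignment.
    Rows with no alignment at the center keep their relative order
    and are placed after all rows that have one.
    """
    def row_key(row):
        for aln in row:
            if aln["start"] <= center < aln["end"]:
                return (0, key(aln))
        return (1, 0)

    return sorted(rows, key=row_key)


rows = [
    [{"start": 0, "end": 50}, {"start": 100, "end": 150}],
    [{"start": 90, "end": 140}],
    [{"start": 200, "end": 250}],  # does not intersect the center
    [{"start": 80, "end": 130}],
]
by_start = sort_rows_by_center(rows, center=110, key=lambda a: a["start"])
print([[a["start"] for a in row] for row in by_start])
```

Because whole rows move together, alignments away from the center can end up in sparsely filled rows, which is the effect that re-packing corrects.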
IGV provides several features for working with paired-end alignments. This section covers viewing reads as pairs, coloring of mapped paired reads, and the split-screen view. Interpretation of colors is discussed briefly here and in more detail in Interpreting Color by Insert Size and Interpreting Color by Pair Orientation.
By default, IGV displays reads individually because they pack compactly. Select View as pairs from the right-click menu to display pairs together with a line joining the ends as shown in the image below. The hover element details (2) are also displayed either for a single read in normal view (left) or for a pair of reads in paired reads view (right).
Coloring of Mapped Paired Reads
IGV colors paired-end alignments in two ways.
Split-screen views can be invoked on the fly from paired-end alignment tracks. Right-click over an alignment and select View mate region in split screen from the drop-down list. If the clicked alignment does not have a mapped mate, this option is grayed out.
Split-screen view shortcuts: