DisplaySupercontig

From ArachneWiki

DisplaySupercontig
Function          Visualization
Phase             Analysis
Standard CLAs     PRE, DATA, RUN, SUBDIR, GDB, NO_HEADER
Special CLAs      ANNOT, FILE, use_RunTime
Source location   ARACHNE_DIR/reporting

DisplaySupercontig is the most important visual output module. With it you can examine a supercontig and the contigs and reads that comprise it. If you have already run ComputeLibraryClusters, you can also see the supercontig's coverage by read library.

Running DisplaySupercontig

Argument name   Argument type   Default value   Meaning
ANNOT           String          match.ano       A file containing assembly annotations.
FILE            String          [None]          A file where the PostScript output is saved. If unspecified, a temp file in /tmp is used.
use_RunTime     Bool            False           Whether to use RunTime during the DisplaySupercontig shell.

When run, DisplaySupercontig prints some information about the assembly and then presents a prompt:

Loading assembly...
Setting up ContigLocationManager.
Setting up ReadLocationManager.
Setting up ReadPairManager.
Setting up ReadDataManager.
Setting up ContigDataManager.
Setting up SuperDataManager.
Filling out ContigLocationManager.
Loaded 31 supercontigs.

The 10 largest supercontigs are: 28 30 1 2 25 3 9 19 18 29 
Expected coverage: ~12X
>

The Command-Line Interface

After starting up, DisplaySupercontig presents a command-line interface, like a shell. This interface accepts a variety of commands; to get a list of them, press Enter without typing a command.

These commands load data to be displayed:

assem <sub_dir>      load assembly in the given sub_dir
annot <annot_file>   load annotations in <annot_file> in the current sub_dir

These commands display a piece of the specified entity:

s<id> [<location>] [<width>] [<magnification>]    show super <id>
c<id> [<location>] [<width>] [<magnification>]    show contig <id>
r<id> [<location>] [<width>] [<magnification>]    show read <id>

Each command optionally takes location, width, and magnification options:

@b, @e, or @<number>   at begin, at end, or at base <number> (default: center base)
:<number>              width, i.e., number of kilobases to show (default: full length)
x<number>              bases per pixel (default: 100)
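
To make the option syntax concrete, here is a minimal Python sketch of how a display command could be split into its parts. It is not DisplaySupercontig's own parser: the names DisplayCommand, parse_display_command, and pixels_wide are hypothetical, and the sketch assumes that options are separated by whitespace and that numbers are whole numbers. Only the prefixes (s, c, r, @, :, x) and the default of 100 bases per pixel come from the listing above.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union

# Entity prefixes taken from the command listing above.
KINDS = {"s": "super", "c": "contig", "r": "read"}


@dataclass
class DisplayCommand:
    kind: str                                   # "super", "contig", or "read"
    ident: int                                  # numeric id following the prefix
    location: Optional[Union[str, int]] = None  # "b", "e", a base number, or None (center)
    width_kb: Optional[int] = None              # None means full length
    bases_per_pixel: int = 100                  # documented default


def parse_display_command(line: str) -> DisplayCommand:
    """Parse e.g. 's28 @b :50 x20' (assumed whitespace-separated options)."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty command")
    head = re.fullmatch(r"([scr])(\d+)", tokens[0])
    if head is None:
        raise ValueError(f"not a display command: {tokens[0]!r}")
    cmd = DisplayCommand(kind=KINDS[head.group(1)], ident=int(head.group(2)))
    for tok in tokens[1:]:
        if tok in ("@b", "@e"):
            cmd.location = tok[1]                # begin or end
        elif re.fullmatch(r"@\d+", tok):
            cmd.location = int(tok[1:])          # specific base
        elif re.fullmatch(r":\d+", tok):
            cmd.width_kb = int(tok[1:])          # kilobases to show
        elif re.fullmatch(r"x\d+", tok):
            cmd.bases_per_pixel = int(tok[1:])   # magnification
        else:
            raise ValueError(f"unrecognized option: {tok!r}")
    return cmd


def pixels_wide(cmd: DisplayCommand) -> Optional[float]:
    """Approximate drawing width in pixels, or None when showing the full length."""
    if cmd.width_kb is None:
        return None
    return cmd.width_kb * 1000 / cmd.bases_per_pixel
```

Under these assumptions, the input "s28 @b :50 x20" would mean: show supercontig 28 from its beginning, 50 kilobases wide, at 20 bases per pixel, which works out to roughly 2500 pixels across.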

These commands set various persistent display options:

set intra (on|off)       toggle display of intracontig links
set readids (on|off)     toggle display of read ids
set annot (on|off)       toggle display of annotations
set annotnames (on|off)  toggle display of annotation names

This command shows the current configuration:

info            display source data locations and display option settings

The Display

Here is an example of the output of DisplaySupercontig, as rendered by gv.

DisplaySupercontig output

The large "8" near the middle is the id of the contig, while the smaller numbers indicate kilobase coordinates along the contig.

Each arc represents an insert, with the end reads represented by flat triangles pointing in from each end of the arc. The arc spans the observed distance between the two reads, while its height is proportional to the expected size of the insert (according to the library from which it came). Flatter arcs therefore represent stretched inserts, and peaked arcs represent compressed inserts.

As a further visual cue, the arcs and triangles are colored along a spectrum from green (stretched) to gray (normal) to blue (compressed).

For example, on the left side of the image above, there is a small bright green arc, representing a stretched 4kb insert. As you can see, its observed length is about 5kb, so it is indeed stretched.

Conversely, on the right side of the image above, there is a larger bright blue arc, representing a compressed 10kb insert. In this case, its observed length is about 8kb.

Inserts whose two reads point in the same direction appear red. Reads that are placed in the same supercontig but hundreds of kbp apart (and are therefore probably chimeric) are colored pink and connected by a dashed pink line. An example of this is just visible in the top left corner of the screenshot above.

The displays below the contig indicate coverage by library, clustered together by size. (The average insert sizes of the clusters are displayed at the far left end of the contig, and are not visible in the screenshot above.)


If we zoom in slightly, we can see individual reads more clearly.

DisplaySupercontig output, zoomed

Read colors

The coverage by each library size is broken down into coverage by inserts that are stretched or compressed to varying degrees:

  • dark green: stretched by more than 3 standard deviations
  • light green: stretched by between 1 and 3 standard deviations
  • gray: neither stretched nor compressed by more than 1 standard deviation
  • light blue: compressed by between 1 and 3 standard deviations
  • dark blue: compressed by more than 3 standard deviations

This scale is similar to that used to color the arcs, but does not correspond exactly.
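
The coverage color bands listed above can be illustrated with a small Python sketch. This is not the code DisplaySupercontig uses: the function name coverage_color is hypothetical, the expected size and standard deviation are assumed to come from the insert's library, and the handling of values that fall exactly on 1 or 3 standard deviations is a guess, since the page does not specify it.

```python
def coverage_color(observed_bp: float, expected_bp: float, stddev_bp: float) -> str:
    """Map an insert to one of the five coverage color bands.

    The deviation is measured in library standard deviations; positive
    values mean the insert is stretched, negative values compressed.
    """
    if stddev_bp <= 0:
        raise ValueError("standard deviation must be positive")
    z = (observed_bp - expected_bp) / stddev_bp
    if z > 3:
        return "dark green"    # stretched by more than 3 s.d.
    if z > 1:
        return "light green"   # stretched by between 1 and 3 s.d.
    if z >= -1:
        return "gray"          # within 1 s.d. (boundary handling assumed)
    if z >= -3:
        return "light blue"    # compressed by between 1 and 3 s.d.
    return "dark blue"         # compressed by more than 3 s.d.
```

For instance, with a made-up library of 4000 bp inserts and a standard deviation of 400 bp, an insert observed at 5000 bp is 2.5 standard deviations long and would fall in the light green band.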

Reads whose partners have been deleted (usually because they are low quality) appear as gray outlines.

Reads whose partners exist but have not been placed anywhere appear as orange outlines.

Reads whose partners exist but have been placed in another supercontig appear as solid orange triangles with a dashed orange half-arc coming out of them. The id of the supercontig where the partner is placed is displayed at the top of the half-arc (in very small type).
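
The three special cases above amount to a lookup from a read's partner status to a drawing style. The Python sketch below restates that lookup for quick reference; the names PartnerStatus and read_style and the dictionary layout are hypothetical and do not reflect how DisplaySupercontig represents this internally.

```python
from enum import Enum
from typing import Optional


class PartnerStatus(Enum):
    DELETED = "deleted"          # partner removed, usually for low quality
    UNPLACED = "unplaced"        # partner exists but is not placed anywhere
    OTHER_SUPER = "other_super"  # partner placed in a different supercontig


def read_style(status: PartnerStatus, partner_super: Optional[int] = None) -> dict:
    """Return a description of how a read with the given partner status is drawn."""
    if status is PartnerStatus.DELETED:
        return {"color": "gray", "fill": "outline"}
    if status is PartnerStatus.UNPLACED:
        return {"color": "orange", "fill": "outline"}
    if status is PartnerStatus.OTHER_SUPER:
        # The half-arc is labeled with the id of the partner's supercontig.
        return {
            "color": "orange",
            "fill": "solid",
            "half_arc": "dashed",
            "label": partner_super,
        }
    raise ValueError(f"unknown status: {status!r}")
```

Reads whose partners are placed in the same supercontig are not covered here; they are drawn as the ends of the arcs described under The Display.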
