Chromosome

From ArachneWiki

A chromosome is a large-scale DNA sequence found in cells and the largest component of a genome. Telomeres appear at either end of a chromosome.

Arachne does not deal with chromosomes directly, but a completed assembly can reveal a good deal about an organism's chromosomes. Ideally, each supercontig corresponds directly to a chromosome and can be mapped to it; this page discusses such mappings.

Mapping supercontigs to chromosomes

Arachne provides mechanisms to assign supercontigs to chromosomes via genetic markers, FISH, or an optical map. Here we describe mapping via genetic markers and FISH.

For our purposes, a "genetic map" is any mapping based on a list of markers, their positions on chromosomes, and their sequences. The marker sequences are aligned to the assembly using a native Arachne aligner (QueryLookupTable), and the mapping then proceeds by marker name.

Alternatively, "insert mapping" refers to any mapping in which the markers are inserts (read pairs) found in the assembly. Marker position is determined not by alignment but by the position of the insert in the assembly. This type of mapping applies to FISH data as well as to fingerprint maps.

Methodology

  1. Convert the map from its original format into one that the rest of the mapping code can read. The converted map is a text file with one marker per line and three tab-delimited fields per line: the chromosome ID, the marker name, and the position on the chromosome, in that order. Arachne provides the SetupMap module for this conversion, but using it is not required.
  2. If you have a genetic map, align the markers to the assembly using QueryLookupTable. Run QueryLookupTable with PARSEABLE=True and QUERY_NAMING=from_record.
  3. Combine the markers and their locations on the assembly into a single 'hits' file, joined on the marker name. This is done with the SetupHits module.
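
The map format from step 1 and the join by marker name from step 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how SetupMap or SetupHits are implemented: the Marker type, the read_map and join_by_name helpers, the marker and scaffold names, and the layout of the placements dictionary are all hypothetical. Positions are read as plain numbers because the page does not specify their units.

```python
import csv
import io
from typing import NamedTuple


class Marker(NamedTuple):
    chromosome: str   # chromosome ID (first column)
    name: str         # marker name (second column)
    position: float   # position on the chromosome (third column; units assumed)


def read_map(handle):
    """Parse the three-column, tab-delimited map described in step 1."""
    markers = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        chrom, name, pos = row
        markers.append(Marker(chrom, name, float(pos)))
    return markers


def join_by_name(markers, placements):
    """Pair each marker with its assembly placements via the marker name.

    `placements` maps marker name -> list of (supercontig, offset) tuples.
    This layout is hypothetical; it is not the format SetupHits reads or writes.
    """
    hits = []
    for m in markers:
        for supercontig, offset in placements.get(m.name, []):
            hits.append((m.chromosome, m.name, m.position, supercontig, offset))
    return hits


# Made-up example data: three markers, two of which were placed on the assembly.
example_map = "I\tmarkerA\t0.0\nI\tmarkerB\t12.5\nII\tmarkerC\t3.1\n"
example_placements = {
    "markerA": [("scaffold_7", 1500)],
    "markerC": [("scaffold_2", 98000)],
}

markers = read_map(io.StringIO(example_map))
hits = join_by_name(markers, example_placements)
for hit in hits:
    print("\t".join(str(field) for field in hit))
```

Markers with no placement on the assembly (markerB in this example) simply produce no hit. Whether the real hits file records such markers is not stated on this page.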

Display

You can display the mapping either as a text file via PutMarkersOnAssembly or graphically via DisplayMapping.

DisplayMapping image: Stickleback genome linkage group I. The top line is the map, with vertical lines representing the individual markers. The marker overwritten with an "X" has been declared "suspect" at some point, and its mapping is not shown. The bottom line is the assembly, with the different scaffolds in different colors. The scaffold number is displayed below the line. Thin vertical lines represent gaps in the scaffold. Black lines joining the two show the association between the map and the assembly.