Sequencing & Assembly

Whole Genome Shotgun Overview

This genome was sequenced using 454 Whole Genome Shotgun methodology, whereby:

  1. DNA is sheared into small fragments (~0.6 kb or ~3 kb).
  2. The 0.6 kb fragments are tailed with 454 sequencing adapters.
  3. The 3 kb fragments are circularized on a biotinylated linker and the circles are sheared. Fragments containing the biotinylated linker are then retrieved and tailed with 454 sequencing adapters.
  4. Adapter-tailed fragments are sequenced from one end, producing fragment reads (from the 0.6 kb fragments) or paired reads (from the 3 kb fragments).

Newbler (454 Life Sciences) Assembly

  1. The assembly process uses fragment and paired reads to identify contiguous stretches of sequence (contigs)
  2. Contigs are ordered and linked together into larger supercontigs by using paired reads lying in different contigs
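
The linking idea in step 2 can be illustrated with a minimal sketch. This is not Newbler's actual algorithm. It only counts read pairs whose two mates fall in different contigs, on the assumption that each pair has already been mapped to contigs. The function name `link_contigs`, the `min_links` threshold, and the contig labels are hypothetical.

```python
from collections import Counter


def link_contigs(read_pairs, min_links=2):
    """Count read pairs whose two mates lie in different contigs.

    read_pairs: iterable of (contig_of_mate_1, contig_of_mate_2) tuples.
    min_links:  hypothetical threshold; contig pairs supported by fewer
                read pairs than this are dropped as unreliable.
    Returns a dict mapping an unordered contig pair to its link count.
    """
    links = Counter()
    for a, b in read_pairs:
        if a != b:  # mates in the same contig carry no linking information
            links[tuple(sorted((a, b)))] += 1
    return {pair: n for pair, n in links.items() if n >= min_links}


# Illustrative data only: two pairs bridge c1 and c2, one bridges c2 and c3.
pairs = [("c1", "c2"), ("c2", "c1"), ("c1", "c1"), ("c2", "c3")]
print(link_contigs(pairs))
```

A real assembler would also use the orientation of each mate and the expected insert size (~3 kb here) to order and orient the linked contigs. This sketch leaves both out.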

For more information on 454 sequencing and the Newbler assembler, see: http://www.454.com

Supercontig/Contig Numbering

  • Supercontig and contig numbers are preceded by the version of the assembly. For example:
    • Contig 1.25 - refers to contig number 25 within assembly 1.
    • Supercontig 1.2 - refers to supercontig number 2 within assembly 1.
  • Supercontigs are numbered in order of decreasing length. For example, supercontig 1.1 is the largest, and supercontig 1.2 is the next largest.
  • Contigs within supercontigs are ordered positionally. For example, supercontig 1.1 contains contigs 1,2,3... (in that order).

There is no correspondence between contig or supercontig numbers in different assemblies.
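
As a rough illustration of the numbering scheme above, the following sketch parses an identifier into its assembly version and number, and assigns supercontig numbers in order of decreasing length. The helper names `parse_id` and `number_supercontigs` are hypothetical, and the label format is assumed to be exactly "Contig V.N" or "Supercontig V.N" as in the examples.

```python
import re

# Assumed label format: "Contig 1.25" or "Supercontig 1.2".
ID_PATTERN = re.compile(r"^(Supercontig|Contig)\s+(\d+)\.(\d+)$", re.IGNORECASE)


def parse_id(label):
    """Split a label into (kind, assembly_version, number)."""
    match = ID_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized identifier: {label!r}")
    kind, version, number = match.groups()
    return kind.lower(), int(version), int(number)


def number_supercontigs(lengths, version):
    """Label supercontigs so that number 1 is the longest, 2 the next, and so on."""
    ordered = sorted(lengths, reverse=True)
    return [
        (f"Supercontig {version}.{i}", length)
        for i, length in enumerate(ordered, start=1)
    ]


print(parse_id("Contig 1.25"))
print(number_supercontigs([500, 1200, 800], version=1))
```

Because the numbers depend on the lengths within one assembly, the same label in two assembly versions need not refer to the same sequence.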

Assembly Version 2

The improved version of the assembly, produced by our collaborators, consists of 67 contigs (N50 = 3.46 Mb) and a single 42 kb mitochondrial chromosome, compared with 431 contigs (N50 = 0.218 Mb) for the previous assembly. The new version was constructed using manual finishing and a bacterial artificial chromosome (BAC) map. The BAC map was generated and assigned to scaffolds by fingerprint mapping with BACFinder and by sequencing BAC ends. Using TERMINUS (Li et al., 2005), telomeres were identified at all chromosome ends except on chromosome 6 and at one end each of chromosomes 1, 12, and 13.

Contig joins were made by manually inspecting the previous Arachne assembly in Consed, together with unplaced and putatively chimeric reads, and by using the ordering of contigs from the BAC map. In addition, PCR primers were designed for pairs of putatively neighboring contig ends, providing primary evidence for closing 96 gaps. Re-analysis of the original fosmid reads using Consed, together with a limited number of additional PCR reactions, gave evidence for closing 60 more gaps. Alignment of the contigs from the previous assembly revealed 88 cases in which contigs overlapped. In 91 additional cases, neighboring contigs were joined by abutting them without known joining sequence. Eighty-one contigs from the previous assembly (484,164 nt, or 1.3 percent of the total sequence) could not be placed on chromosomes.
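
For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch using the common definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative and are not taken from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Illustrative lengths only (total 32, so the half-way point is 16).
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A larger N50 with fewer contigs, as reported for the improved assembly, means that more of the sequence sits in long contiguous pieces.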