Whole-genome sequence assembly for mammalian genomes: Arachne 2.

Genome Res
Authors
Keywords
Abstract

We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal changes were simultaneously made and applied to the assembly of the mouse genome, during a six-month period of development: (1) Supercontigs (scaffolds) were iteratively broken and rejoined using several criteria, yielding a 64-fold increase in length (N50), and apparent elimination of all global misjoins; (2) gaps between contigs in supercontigs were filled (partially or completely) by insertion of reads, as suggested by pairing within the supercontig, increasing the N50 contig length by 50%; (3) memory usage was reduced fourfold. The outcome of this mouse assembly and its analysis are described in (Mouse Genome Sequencing Consortium 2002).
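The abstract reports its improvements in terms of N50 length. As a reminder of what that statistic measures, here is a minimal sketch using the standard definition: the largest length L such that sequences of length at least L contain at least half of all assembled bases. The function name `n50` and the sample lengths are illustrative assumptions and are not taken from Arachne or from the paper.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or supercontig lengths.

    Illustrative helper (hypothetical name, not part of Arachne).
    """
    # Sort longest first so the running sum reaches the halfway point
    # using the fewest, longest sequences.
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length

    # Not reached for positive lengths; kept as a defensive fallback.
    return ordered[-1]


# Made-up lengths: total is 290, and 80 + 70 = 150 covers half of it.
example = n50([80, 70, 50, 40, 30, 20])
```

Here `example` is 70. Because the statistic depends only on the distribution of lengths, joining pieces into longer ones raises N50 even when the total number of assembled bases barely changes, which is why it is a common way to report contig and supercontig continuity.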

Year of Publication
2003
Journal
Genome Res
Volume
13
Issue
1
Pages
91-6
Date Published
2003 Jan
ISSN
1088-9051
DOI
10.1101/gr.828403
PubMed ID
12529310
PubMed Central ID
PMC430950