ALLPATHS: de novo assembly of whole-genome shotgun microreads.

Genome Res

Authors	Jonathan Butler, Iain MacCallum, Michael Kleber, Ilya Shlyakhter, Matthew Belmonte, Eric Lander, Chad Nusbaum, David Jaffe
Keywords	Computational Biology; Algorithms; Reproducibility of Results; Computer Simulation; Sequence Analysis, DNA; Genome, Bacterial; Escherichia coli; Campylobacter jejuni
Abstract	New DNA sequencing technologies deliver data at dramatically lower costs but demand new analytical methods to take full advantage of the very short reads that they produce. We provide an initial, theoretical solution to the challenge of de novo assembly from whole-genome shotgun "microreads." For 11 genomes of sizes up to 39 Mb, we generated high-quality assemblies from 80x coverage by paired 30-base simulated reads modeled after real Illumina-Solexa reads. The bacterial genomes of Campylobacter jejuni and Escherichia coli assemble optimally, yielding single perfect contigs, and larger genomes yield assemblies that are highly connected and accurate. Assemblies are presented in a graph form that retains intrinsic ambiguities such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. For both C. jejuni and E. coli, this assembly graph is a single edge encompassing the entire genome. Larger genomes produce more complicated graphs, but the vast majority of the bases in their assemblies are present in long edges that are nearly always perfect. We describe a general method for genome assembly that can be applied to all types of DNA sequence data, not only short read data, but also conventional sequence reads.
Year of Publication	2008
Journal	Genome Res
Volume	18
Issue	5
Pages	810-20
Date Published	2008 May
ISSN	1088-9051
URL	http://genome.cshlp.org/cgi/pmidlookup?view=long&pmid=18340039
DOI	10.1101/gr.7337908
PubMed ID	18340039
PubMed Central ID	PMC2336810
Grant list	R01 HG003474 / HG / NHGRI NIH HHS / United States; 5R01HG003474 / HG / NHGRI NIH HHS / United States