PLoS One DOI:10.1371/journal.pone.0112963

Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement.

Publication Type: Journal Article
Year of Publication: 2014
Authors: Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM
Journal: PLoS One
Date Published: 2014
Keywords: Algorithms; Bacteria; Genetic Variation; Genome, Bacterial; Molecular Sequence Data; Sequence Analysis, DNA; Software

Advances in modern sequencing technologies allow us to generate sufficient data to analyze hundreds of bacterial genomes from a single machine in a single day. This potential for sequencing massive numbers of genomes calls for fully automated methods to produce high-quality assemblies and variant calls. We introduce Pilon, a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions. Pilon works with many types of sequence data, but is particularly strong when supplied with paired-end data from two Illumina libraries with small (e.g., 180 bp) and large (e.g., 3-5 Kb) inserts. Pilon significantly improves draft genome assemblies by correcting bases, fixing mis-assemblies, and filling gaps. For both haploid and diploid genomes, Pilon produces more contiguous genomes with fewer errors, enabling identification of more biologically relevant genes. Furthermore, Pilon identifies small variants with high accuracy compared to state-of-the-art tools and is unique in its ability to accurately identify large sequence variants, including duplications, and to resolve large insertions. Pilon is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains. Pilon is freely available as open source software.


Alternate Journal: PLoS ONE
PubMed ID: 25409509
PubMed Central ID: PMC4237348
Grant List:
HHSN272200900018C / AI / NIAID NIH HHS / United States
U19 AI110818 / AI / NIAID NIH HHS / United States
HHSN272200900018C / / PHS HHS / United States
U54HG003067 / HG / NHGRI NIH HHS / United States