Getting Started

WARNING: This documentation is preliminary and still under development!

1.0: Overview

Pilon is a software tool that can be used to:

  • Automatically improve draft assemblies
  • Find variation among strains, including large events

Pilon requires as input a FASTA file of the genome along with one or more BAM files of reads aligned to that FASTA file. Pilon analyzes the read alignments to identify inconsistencies between the input genome and the evidence in the reads. It then attempts to improve the input genome by:

  • Correcting single-base differences
  • Correcting small indels
  • Correcting larger indel or block substitution events
  • Filling gaps
  • Identifying local misassemblies, optionally opening new gaps

Pilon then outputs a FASTA file containing an improved representation of the genome based on the read data and, optionally, a VCF file detailing the variation seen between the read data and the input genome. To aid manual inspection and improvement by an analyst, Pilon can optionally produce tracks that can be displayed in genome viewers such as IGV and GenomeView, and it reports other events (such as possible large collapsed repeat regions) in its standard output.

1.1: System Requirements

  • Java runtime 1.6 or later
  • 8 GB or more of memory to allocate to the JVM. The amount of memory required depends on the genome, the read data, and how many fixes Pilon needs to make. Generally, bacterial genomes with ~200x Illumina coverage will require at least 8 GB, though 16 GB is recommended.
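
The Java requirement above can be checked before running Pilon. The following is a minimal sketch, not part of Pilon itself: the helper name java_major is hypothetical, and the script assumes that `java -version` prints a line containing `version "…"`, as Oracle and OpenJDK runtimes typically do.

```shell
#!/bin/sh
# Sketch: check that the installed Java meets the "1.6 or later" requirement.
# java_major is an illustrative helper, not something shipped with Pilon.

# Reduce a Java version string to its major number:
#   legacy scheme "1.N.x" -> N, modern scheme "N.x.y" -> N.
java_major() {
    case "$1" in
        1.*) rest="${1#1.}"; echo "${rest%%[!0-9]*}" ;;
        *)   echo "${1%%[!0-9]*}" ;;
    esac
}

if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr; take the first quoted version string.
    ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    if [ -n "$ver" ] && [ "$(java_major "$ver")" -ge 6 ]; then
        echo "Java $ver is new enough for Pilon"
    else
        echo "Java 1.6 or later is required (found: ${ver:-unknown})" >&2
    fi
else
    echo "java not found on PATH" >&2
fi
```

The memory requirement is not checked here. It is set per run through the JVM heap option (`-Xmx`), so the right value depends on the genome and read data.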

1.2: Installation

Pilon is distributed as a single jar file. Download the jar and run it with a command such as:

java -Xmx16G -jar /path/to/pilon.jar --genome assembly.fasta --frags frags.bam --jumps jumps.bam
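
When the same command is run repeatedly, it may help to wrap it in a small script that keeps the heap size and jar location in one place. The sketch below is one way to do that under stated assumptions: PILON_JAR, HEAP, check_inputs, and pilon_command are hypothetical names introduced for illustration, and the script only prints the command (a dry run) rather than launching Pilon.

```shell
#!/bin/sh
# Sketch only: PILON_JAR, HEAP and the helper names below are illustrative.
PILON_JAR="${PILON_JAR:-/path/to/pilon.jar}"
HEAP="${HEAP:-16G}"   # JVM heap size, passed to java as -Xmx

# Return 0 only if every named file exists and is non-empty.
check_inputs() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done
}

# Print the Pilon command line without running it.
# $1 = genome FASTA, $2 = BAM for --frags, $3 = BAM for --jumps
# (paths containing spaces would need extra quoting before execution)
pilon_command() {
    printf 'java -Xmx%s -jar %s --genome %s --frags %s --jumps %s\n' \
        "$HEAP" "$PILON_JAR" "$1" "$2" "$3"
}

# Dry run: show the command and report any input that is not in place yet.
pilon_command assembly.fasta frags.bam jumps.bam
check_inputs assembly.fasta frags.bam jumps.bam \
    || echo "inputs not ready; command not run" >&2
```

Setting HEAP=8G in the environment before calling the script would lower the heap to the minimum suggested in the system requirements. Once the inputs are in place, the printed command can be copied and run as is.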

1.3: Next Steps

Learn how to use Pilon by visiting the Pilon documentation: