Looking at Assemblies

From ArachneWiki


After you have run Arachne and created a draft assembly, the next step is to examine it: you need a detailed understanding of an assembly before you can improve it. Arachne therefore provides several tools for assembly analysis. The modules described below generally require a SUBDIR command-line argument specifying the assembly location. Note also that, in the context of these modules, the RUN directory should include /work.

Text output modules

Arachne stores its data in a binary format, so you cannot examine your assembly just by opening its output files directly. However, Arachne provides text output modules, which, as the name suggests, read the binary output and produce human-readable text files.

The modules BasicStats and BasicAssemblyStats print out vital statistics of an assembly, such as the contig N50, scaffold N50, the read coverage, and so on.
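To make the N50 statistic concrete, here is a minimal Python sketch of the usual definition: the length L such that contigs (or scaffolds) of length L or longer account for at least half of the total assembled length. The function name `n50` and the input list are illustrative assumptions; this is not Arachne's own code, and the modules above may differ in details such as how gaps in scaffolds are counted.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative only: uses the common definition of N50, which may not
    match the exact convention used by BasicStats or BasicAssemblyStats.
    """
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half of the
        # total assembled length is the N50.
        if running >= half_total:
            return length
    return 0  # empty input: no sequences, so no N50


if __name__ == "__main__":
    # Hypothetical contig lengths, in bases.
    contig_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(contig_lengths))
```

A larger N50 generally indicates a more contiguous assembly, which is why it is reported separately for contigs and for scaffolds.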

The module FastbStats gives statistics about a fastb file.

The module VerifyAssemblyIntegrity provides a "sanity check" to make sure nothing is horribly wrong with your assembly.

The module LibStatsOverview provides useful statistics on the reads and inserts in each library.

The module RunMarkup provides a thorough analysis of repeats.

The module PolymorphismEstimator estimates the heterozygosity for polymorphic genomes.

The module GenerateTilings writes out an entire contig, base by base, and lines it up against the input reads that contributed to it. It writes its output in the Tilings subdirectory of SUBDIR. (GenerateTilings may take a long time to run!) A similar module is CreateAceFile, which produces ace files that can be read by Consed and look very much like the output of GenerateTilings.

Visual output modules

Visual output modules allow you to inspect your assembly visually. The primary visual output module is DisplaySupercontig.

The modules SuperDotPlot and FastbDotPlot allow you to compare two different supercontigs with a dot plot.
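For readers unfamiliar with dot plots, the idea can be sketched in a few lines of Python: place one sequence along each axis and mark every position pair where the two sequences share an identical k-mer, so that similar regions show up as diagonal runs. The functions `dot_plot` and `render` and the parameter `k` are hypothetical names chosen for this illustration. The sketch compares forward strands only and says nothing about how SuperDotPlot or FastbDotPlot are actually implemented.

```python
def dot_plot(seq_a, seq_b, k=3):
    """Return the set of (i, j) with seq_a[i:i+k] == seq_b[j:j+k].

    Illustrative sketch: exact k-mer matches on the forward strand only.
    """
    # Index every k-mer of seq_b by its starting positions.
    index = {}
    for j in range(len(seq_b) - k + 1):
        index.setdefault(seq_b[j:j + k], []).append(j)

    # Look up each k-mer of seq_a and record the matching position pairs.
    hits = set()
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], ()):
            hits.add((i, j))
    return hits


def render(hits, rows, cols):
    """Draw the hits as text: one row per seq_a position, '*' for a match."""
    return "\n".join(
        "".join("*" if (i, j) in hits else "." for j in range(cols))
        for i in range(rows)
    )


if __name__ == "__main__":
    a, b = "ACGTAC", "ACGT"  # toy sequences, not real assembly data
    hits = dot_plot(a, b, k=3)
    print(render(hits, len(a) - 2, len(b) - 2))
```

In such a plot, a long unbroken diagonal suggests that the two sequences agree over that stretch, while off-diagonal marks often point to repeats.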
