We’ve made some minor changes to DISCOVAR de novo output files to make them easier to use. See the manual or the Edge, lines and scaffolds primer for more details.
The assembly graph can be large, complex and unwieldy, so DISCOVAR de novo does not generate a viewable graph directly. Instead we have developed an interactive tool that allows you to explore your assembly by creating smaller viewable graphs of the regions you are interested in. This new tool, called NhoodInfo, is part of the DISCOVAR package as of release 50612. It is also the engine behind our online demo, so you can try it out right now without having to create an assembly of your own. Full instructions on using NhoodInfo are included in the DISCOVAR package.
DISCOVAR takes as input read pairs from fragments of size 400-500 bp, with some larger and some smaller. The blog and manual contained references to fragments of size 700 bp, which were outdated, and have now been removed. Note that the protocol yields a wide size distribution, including some large fragments.
A DISCOVAR de novo assembly is a graph. A typical assembly consists almost entirely of linear stretches, which we call ‘lines’. Lines provide a rich data type that captures polymorphism and other important features. Further, with some loss of information, these lines may be ‘flattened’ into standard contigs. We have added a tutorial explaining how these data types are made available as part of the DISCOVAR output. We are also interested in hearing your thoughts on the utility of these output types and on others that might be useful to you.
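To make the idea of ‘flattening’ concrete, here is a toy sketch in Python. It is not DISCOVAR code and does not read DISCOVAR output files. The representation (a line as a list of cells, each holding one or more alternative sequences), the rule of keeping the first alternative, and the names `flatten_line` and `count_discarded` are all assumptions made for illustration. The sketch also ignores how adjacent edges are actually joined in a real assembly.

```python
from typing import List

# Hypothetical toy model: a line is a list of cells, and each cell lists the
# alternative sequences seen at that position (one entry means no variation).
Line = List[List[str]]


def flatten_line(line: Line) -> str:
    """Collapse a line into one contig by keeping the first alternative per cell."""
    return "".join(cell[0] for cell in line)


def count_discarded(line: Line) -> int:
    """Count the alternatives dropped by flattening (the information that is lost)."""
    return sum(len(cell) - 1 for cell in line)


# A small made-up line: a SNP (A or G) and an insertion/deletion ("" or "CAG").
example_line: Line = [["ACGT"], ["A", "G"], ["TTGCA"], ["", "CAG"], ["GG"]]

contig = flatten_line(example_line)
discarded = count_discarded(example_line)

print(contig)     # ACGTATTGCAGG
print(discarded)  # 2
```

The two discarded alternatives are exactly the polymorphism that the line captured and the flattened contig cannot, which is the loss of information mentioned above.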