Assessing assembly methods

For purposes of assessing our assemblies and variant calls, we generated some NA12878 clone reference sequences.  We believe that these data will be of interest to the community and have therefore decided to make them available to all. These clone sequences and the raw data used to generate them can be found on our FTP site.

The sequences were obtained by randomly selecting ~100 clones from an NA12878 Fosmid library.  Two pools of ~50 clones each were created, then sequenced on MiSeq (250-base reads) and PacBio (~3000-base reads).  Some jumping-library reads ("jumps") are also included.

We completely and unambiguously assembled 103 clones, in some cases with manual intervention.  Cloning vector sequence has been removed.  The pools contain a small number of additional clones that are not included in the assemblies, among them a few with low coverage, some EBV, and some centromeric sequence.

This is version 1.0 of the set.  We believe that the error rate on the clones is very low; however, we are carrying out laboratory validation and will roll out updated versions as the results come back.

This work is supported by NHGRI grants.

New DISCOVAR release

A new release (r46399) of DISCOVAR is now available. It contains the following changes:

- More robust SAMtools version checking.
- Improvements to .variant file format.
- The MALLOC_PER_THREAD=1 environment setting is no longer mandatory. However, setting it may still give a significant performance boost.
- Various bug fixes.

Thanks to all the users who have brought these problems to our attention.
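
For users who launch DISCOVAR from a script, the optional MALLOC_PER_THREAD setting mentioned above can be passed to a child process as sketched below. This is only an illustration under stated assumptions: the helper names are hypothetical, and the command shown is a stand-in that merely echoes the variable, since the DISCOVAR command line itself is not described here.

```python
import os
import subprocess
import sys


def env_with_malloc_per_thread(base_env=None):
    """Return a copy of the environment with MALLOC_PER_THREAD set to "1".

    Hypothetical helper; the caller's environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MALLOC_PER_THREAD"] = "1"
    return env


def run_with_malloc_per_thread(command):
    """Run `command` (a list of arguments) with MALLOC_PER_THREAD=1.

    Returns the captured standard output of the child process.
    """
    result = subprocess.run(
        command,
        env=env_with_malloc_per_thread(),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Stand-in command (an assumption, not a DISCOVAR invocation): a child
# Python process that only reports the value of the variable it sees.
STAND_IN_COMMAND = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('MALLOC_PER_THREAD'))",
]

if __name__ == "__main__":
    print(run_with_malloc_per_thread(STAND_IN_COMMAND).strip())
```

To use this with a real run, replace STAND_IN_COMMAND with your own DISCOVAR command and arguments. Setting the variable per process, rather than globally, keeps the change from affecting other programs.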

DISCOVAR has been released!

We are pleased to announce that DISCOVAR is now available to download.

DISCOVAR is a variant caller and genome assembler from the Broad Institute. It uses the latest low-cost sequencing data, and can generate highly accurate variant calls for individual humans, or assemble small genomes de novo (with support for large genomes to follow). We expect it will be particularly valuable for understanding human Mendelian disease, but it is equally suited to investigating the biology of other organisms.

Find out more about DISCOVAR, and please check out the FAQ and help too.