Error correction

Error correction is a pre-processing algorithm in which reads are modified to remove likely sequencing errors. It relies on the results of the overlap process, in which read-read alignments are created. If a large number of reads are aligned together at a particular location, and one read disagrees with a strong majority of the others, the odd read is likely to contain a sequencing error. (On the other hand, the disagreeing base may be a genuine SNP.) Error correction is much more effective at higher coverage, because more reads aligned at each location make the majority call more reliable.
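
The majority-vote idea can be illustrated with a minimal sketch. This is not Arachne's actual implementation: the function name correct_column and the thresholds min_depth and min_majority are hypothetical, chosen only for illustration. It looks at one aligned location (the base each overlapping read reports there) and overwrites dissenting bases only when coverage is deep enough and the majority is strong.

```python
from collections import Counter


def correct_column(bases, min_depth=5, min_majority=0.8):
    """Return a corrected copy of one alignment column.

    bases: list of single-character base calls, one per aligned read.
    min_depth and min_majority are illustrative thresholds, not values
    taken from Arachne.
    """
    if len(bases) < min_depth:
        # Too few reads at this location to trust a vote.
        return list(bases)
    consensus, count = Counter(bases).most_common(1)[0]
    if count / len(bases) < min_majority:
        # No strong majority: the disagreement may be a SNP, so leave it.
        return list(bases)
    # Strong majority: treat the dissenting bases as sequencing errors.
    return [consensus] * len(bases)
```

With these assumed thresholds, a column of five A's and one C is corrected to all A's, while an evenly split column or a column covered by only three reads is left untouched. This also shows why low coverage limits what error correction can do.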

A secondary advantage of error correction is that it reduces the amount of space needed to store read-to-read overlaps (in the file aligns.total2).

The Arachne modules that handle error correction are named, appropriately, CorrectErrors1 and CorrectErrors2.

