ReadsToAligns

Function: Alignment
Phase: Pre-processing
Standard CLAs: PRE, DATA, RUN, GDB, NO_HEADER
Special CLAs: <various>
Source location: ARACHNE_DIR/assemble


The ReadsToAligns modules (ReadsToAligns1, ReadsToAligns2, and ReadsToAligns3) are three pre-processing modules that generate read-read alignments, thereby performing the overlap phase of the assembly process. They divide the work between them as follows:

  • ReadsToAligns1: Considers pairs of reads and attempts to align them. This is time-intensive, although several heuristics avoid examining every possible pair of reads, most notably dividing the reads into piles and examining only read pairs within a single pile. Uses the MakeAligns module and creates the aligns.pile files. In addition, because the number of reads can be very large, a pre-selection step searches for perfect k-mer matches between reads and uses these matches as seeds for the real read-to-read alignments. Repetitive k-mers (as marked by TagRepeatReads) are not used for this seeding, because they would produce too many k-mer matches; note that this prevents repetitive regions of the genome from being mapped out.
  • ReadsToAligns2: Refines the read-read alignments by adding in reverse complements, remediating, and removing bad alignments. Reads the aligns.pile files and creates aligns.total2.
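
The k-mer pre-selection idea described for ReadsToAligns1 can be illustrated with a minimal Python sketch. This is not Arachne's code: the function name candidate_pairs, the parameter k, the repetitive_kmers set (standing in for the k-mers that TagRepeatReads would mark), and the toy reads are all hypothetical, and the sketch ignores piles and reverse complements. It only shows how indexing reads by their k-mers yields candidate pairs without comparing every read against every other, and how skipping repetitive k-mers removes candidates.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(reads, k, repetitive_kmers=frozenset()):
    """Return index pairs of reads that share at least one non-repetitive k-mer.

    Hypothetical illustration only; the real modules work on piles of reads
    and write aligns.pile files, which this sketch does not attempt to model.
    """
    # Map each k-mer to the set of reads in which it occurs.
    kmer_to_reads = defaultdict(set)
    for idx, seq in enumerate(reads):
        for pos in range(len(seq) - k + 1):
            kmer = seq[pos:pos + k]
            if kmer in repetitive_kmers:
                continue  # repetitive k-mers are not used as seeds
            kmer_to_reads[kmer].add(idx)

    # Any two reads sharing a k-mer become a candidate for full alignment.
    pairs = set()
    for ids in kmer_to_reads.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Toy reads (assumed data): reads 0 and 1 overlap, as do reads 2 and 3.
reads = ["ACGTACGGA", "TACGGATTC", "GGGGCCCCA", "TTCAGGGGC"]

all_pairs = candidate_pairs(reads, k=5)
filtered_pairs = candidate_pairs(reads, k=5, repetitive_kmers={"GGGGC"})
```

In this toy input, reads 2 and 3 share only the k-mer GGGGC, so marking it as repetitive drops that pair from the candidates. This mirrors the trade-off noted above: excluding repetitive k-mers keeps the number of seeds manageable, but overlaps that lie entirely within repeats are never found.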

In module pipelines, the ReadsToAligns modules should always be run together. It is a good idea to follow up with additional alignment-pruning modules, such as EraseImproperAligns, TidyAligns, and CleanAlignments.
