The assembly process is the series of steps by which an assembly is produced. The process itself is sometimes loosely called the "assembly", but strictly speaking that term refers to the result rather than to the process.
The assembly process is the central task of the Arachne program, in which reads are combined to form contigs, then supercontigs (or scaffolds), and eventually a draft final assembly. It can begin only after pre-processing is complete.
Each step in the assembly process is undertaken by an assembly module. A given assembly process can be thought of as a long pipeline of assembly modules in series, with the pre-processed input at the beginning and a final assembly at the end. The pre-processed input is in RUN, and a new SUBDIR is created every time an assembly module is run. Script modules such as Assemblez provide ready-made pipelines of assembly modules.
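The pipeline idea can be illustrated with a small sketch. This is not Arachne code: `run_pipeline`, `toy_layout`, `toy_consensus`, the `data.txt` file, and the numbered subdirectory names are all hypothetical, chosen only to show modules running in series with each one writing into a freshly created subdirectory.

```python
import tempfile
from pathlib import Path


def run_pipeline(run_dir, modules):
    """Run modules in series; each reads the previous directory and writes a new one."""
    current = run_dir  # the pre-processed input lives in the run directory
    for step, (name, module) in enumerate(modules, start=1):
        subdir = run_dir / f"{step:02d}_{name}"  # hypothetical naming scheme
        subdir.mkdir()
        module(current, subdir)
        current = subdir  # the next module starts from this output
    return current


def toy_layout(src, dst):
    # Stand-in for a real module: copy the input forward and tag it.
    text = (src / "data.txt").read_text()
    (dst / "data.txt").write_text(text + " -> layout")


def toy_consensus(src, dst):
    text = (src / "data.txt").read_text()
    (dst / "data.txt").write_text(text + " -> consensus")


with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp)
    (run / "data.txt").write_text("reads")
    final = run_pipeline(run, [("layout", toy_layout), ("consensus", toy_consensus)])
    result = (final / "data.txt").read_text()
    subdirs = sorted(p.name for p in run.iterdir() if p.is_dir())
```

Because every module writes to its own subdirectory, the output of each intermediate step remains available for inspection after the pipeline finishes.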
The first three phases in the assembly process are not unique to Arachne; they have been used in earlier assemblers such as PHRAP:
- Overlap: The formation of read-read alignments. Performed in Arachne as part of pre-processing, by the ReadsToAligns modules.
- Layout: The grouping of reads based on their alignments, and relative positioning of the reads within their groups. Performed in Assemblator.
- Consensus: The creation of a contig from a group of aligned reads. So called because each base in the contig is chosen by the consensus of the reads aligned at that position.
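As a rough illustration of the consensus phase, the sketch below takes a layout that is assumed to be already known (each read paired with its start offset in the contig) and picks the most common base in each column. The function name `majority_consensus` and the toy reads are invented for this example; a simple majority vote is a simplification, and a production assembler would typically also weigh base quality.

```python
from collections import Counter


def majority_consensus(layout):
    """Build a contig from (offset, read) pairs by per-column majority vote."""
    length = max(offset + len(read) for offset, read in layout)
    columns = [Counter() for _ in range(length)]
    for offset, read in layout:
        for i, base in enumerate(read):
            columns[offset + i][base] += 1
    # Most common base per column; "N" marks a column no read covers.
    return "".join(col.most_common(1)[0][0] if col else "N" for col in columns)


# Toy layout: the last read carries a sequencing error (T instead of A at contig position 4).
layout = [(0, "ACGTAC"), (2, "GTACGG"), (4, "ACGGTT"), (3, "TTCGG")]
contig = majority_consensus(layout)
```

In this toy layout the erroneous base is outvoted by the three other reads covering the same position, so it does not appear in the contig.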