Exact trimming

Exact trimming looks for the last stretch of sequencing vector on the left-hand side of a read, immediately before the DNA from the organism being sequenced. If it is found, the vector is trimmed off. If it is not found, the read is passed on to blast trimming.
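
The idea above can be sketched in a few lines of Python. This is only an illustration, not Arachne's actual implementation: the function name exact_trim, the max_offset search window, and the use of a plain exact string match are all assumptions made for the example.

```python
from typing import Optional


def exact_trim(read: str, insert_site: str, max_offset: int = 100) -> Optional[str]:
    """Return the read with leading vector removed, or None if no site is found.

    max_offset is a hypothetical limit on how far from the left-hand end
    the insert site may start; it is not taken from Arachne.
    """
    # Only inspect the left-hand side of the read.
    window = read[: max_offset + len(insert_site)]
    pos = window.find(insert_site)
    if pos == -1:
        # Not found: the caller would hand the read to blast trimming.
        return None
    # Drop the vector up to and including the insert site.
    return read[pos + len(insert_site):]


# 4 bases of vector, the 13-base insert site, then organism DNA.
read = "ACGT" + "TGTGGTGGAATTC" + "GGGCCCTTTAAA"
print(exact_trim(read, "TGTGGTGGAATTC"))  # GGGCCCTTTAAA
print(exact_trim("AAAACCCC", "TGTGGTGGAATTC"))  # None
```

Returning None rather than the untouched read makes the "not found" case explicit, so the fallback to blast trimming stays visible to the caller.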

Exact trimming is enabled by adding the file insert.sites to the DATA directory for your project. If this file does not exist, or is empty, only blast trimming is performed.
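
As a rough illustration of that rule, the check below reports whether exact trimming would be enabled for a given DATA directory. The function name is hypothetical, and treating "empty" as "zero bytes long" is an assumption; Arachne may define it differently.

```python
from pathlib import Path


def exact_trimming_enabled(data_dir: str) -> bool:
    """True if insert.sites exists in data_dir and is not empty."""
    path = Path(data_dir) / "insert.sites"
    # Assumption: "empty" means a file of zero bytes.
    return path.is_file() and path.stat().st_size > 0
```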

insert.sites is a text file, one tab-delimited entry per line, where each line contains the following information:

the sequencing center, which should match the center field from the metainfo
the insert size in bp
the direction of sequencing, either F or R, which should match the trace_end field from the metainfo
the insert site: the roughly 10 bases of vector, linker, etc. immediately adjacent to the DNA from the organism you're sequencing, e.g. TGTGGTGGAATTC
a switch indicating whether dimer-based sequencing is used (1) or not (0). Dimer-based sequencing occasionally produces a half-dimer that is blunt-ended on both sides; this blunt-blunt half-dimer attaches itself before the read and causes problems.
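
To make the layout concrete, here is a minimal sketch of a parser for insert.sites. It assumes the five fields appear in the order listed above; the class name, attribute names, and the center name EXAMPLE_CENTER are invented for illustration and are not part of Arachne.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InsertSite:
    center: str        # should match the center field from the metainfo
    insert_size: int   # insert size in bp
    direction: str     # "F" or "R", should match trace_end
    site: str          # bases immediately adjacent to the organism's DNA
    dimer: bool        # True if dimer-based sequencing is used


def parse_insert_sites(text: str) -> List[InsertSite]:
    """Parse tab-delimited insert.sites content, one entry per line."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 5:
            raise ValueError(f"expected 5 tab-delimited fields: {line!r}")
        center, size, direction, site, dimer = fields
        if direction not in ("F", "R"):
            raise ValueError(f"direction must be F or R: {direction!r}")
        if dimer not in ("0", "1"):
            raise ValueError(f"dimer switch must be 0 or 1: {dimer!r}")
        entries.append(InsertSite(center, int(size), direction, site, dimer == "1"))
    return entries


# A single hypothetical entry: 4000 bp inserts, forward reads, no dimers.
example = "EXAMPLE_CENTER\t4000\tF\tTGTGGTGGAATTC\t0\n"
entries = parse_insert_sites(example)
print(entries[0].site, entries[0].dimer)  # TGTGGTGGAATTC False
```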