- For the DNA object, see Sequence.
Sequencing or DNA sequencing is the act of analyzing genomic DNA, determining the order of bases, and creating reads. Sequencing methods are laboratory-based procedures for performing sequencing. Each sequencing method creates reads of different lengths, qualities, and linking properties, which necessitates the development of specific algorithms to handle libraries of that type of read. Without sequencing, Whole Genome Shotgun Assembly (WGA) would be impossible.
The first sequencing methods, developed in the 1970s, constituted a breakthrough in genomics. The advent of Sanger sequencing paved the way for massive sequencing projects, such as the Human Genome Project, and kick-started the nascent fields of bioinformatics and computational biology. Sanger reads are ideal for WGA, but computational usefulness is not the only criterion by which technologies are judged. Sanger sequencing requires expensive machinery and is very time-consuming, especially given the huge number of reads that must be generated. In recent years, researchers have begun to explore alternative sequencing methods that require less time and money.
Sanger sequencing is the gold standard in read production. The technique was developed by biochemist Frederick Sanger, earning him the 1980 Nobel Prize in Chemistry. Newer technologies show promise but have yet to eclipse Sanger sequencing in usefulness. ABI reads are a type of Sanger read; they are named after the company (Applied Biosystems) that used Sanger sequencing to create reads for the Human Genome Project.
The biological steps of Sanger sequencing are as follows. Reads are created from inserts, longer sequences of DNA that have been inserted into a vector and grown in a host organism such as E. coli. Inserts are fed into a machine that reads the ends of each insert and reports the bases it sees (hence the term "read"). A base-calling program such as Phred converts the machine's output into base calls. The resulting Sanger reads are typically 600-700 bp long and paired.
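Phred is best known for attaching a quality score to each base call, defined as Q = -10 log10(P), where P is the estimated probability that the call is wrong. The following is a minimal sketch of that conversion, not code taken from Phred itself; the function names are hypothetical and chosen only for illustration.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability P (0 < P <= 1) to a Phred score Q = -10*log10(P)."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Print the error probability implied by a few common quality scores.
    for q in (10, 20, 30, 40):
        print(q, phred_to_error_probability(q))
```

Under this definition, a score of 20 corresponds to a 1 in 100 chance that the base call is wrong, and a score of 30 to a 1 in 1000 chance.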
454 sequencing is a next-generation technology developed by 454 Life Sciences (http://www.454.com/). 454 reads are typically 150-200 bp long (though their lengths are increasing) and unpaired. They have lower quality scores than Sanger reads. The term is read and spoken as "four-five-four" (rather than as a three-digit number, "four hundred fifty-four"). The 454 sequencing methodology is susceptible to homopolymer sequencing errors, in which the number of bases in a run of identical bases is miscounted (for example, AAAA reported as AAA).
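Because homopolymer errors occur in runs of a single repeated base, software that handles 454 reads may want to know where such runs lie in a read. The sketch below shows one simple way to locate them. It is an illustration only, not part of any 454 software; the function name and the `min_length` threshold are assumptions.

```python
from itertools import groupby


def homopolymer_runs(read, min_length=3):
    """Return (start, base, length) for each run of identical bases
    that is at least min_length long. Positions are 0-based."""
    runs = []
    position = 0
    for base, group in groupby(read):
        length = sum(1 for _ in group)  # size of this run of identical bases
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs


if __name__ == "__main__":
    # Hypothetical read containing a run of five T's and a run of three A's.
    print(homopolymer_runs("ACGTTTTTGCAAAC"))
```

For the example read, the function reports a run of five T's starting at position 3 and a run of three A's starting at position 10. Base calls inside such runs are the ones most likely to have an incorrect length in a 454 read.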
Solexa sequencing is a promising next-generation technology. It is high-throughput, producing an extremely large number of reads for a small investment of time and money, but every Solexa read is exactly 36 bp long.
Solexa was developed independently until November 2006, when it was bought by Illumina, Inc. (http://www.illumina.com/), which continues to develop it.
SOLiD sequencing is a next-generation technology developed by Applied Biosystems. The name is an acronym for Supported Oligo Ligation Detection.