
Bioinformatics DOI:10.1093/bioinformatics/btq102

Genome-wide synteny through highly sensitive sequence alignment: Satsuma.

Publication Type: Journal Article
Year of Publication: 2010
Authors: Grabherr, MG, Russell, P, Meyer, M, Mauceli, E, Alföldi, J, di Palma, F, Lindblad-Toh, K
Journal: Bioinformatics
Volume: 26
Issue: 9
Pages: 1145-51
Date Published: 2010 May 01
ISSN: 1367-4811
Keywords: Algorithms, Animals, Computational Biology, Fourier Analysis, Genome, Genomics, Humans, Models, Statistical, Oryza, Probability, Programming Languages, Sequence Alignment, Software, Sorghum, Tetraodontiformes
Abstract

MOTIVATION: Comparative genomics heavily relies on alignments of large and often complex DNA sequences. From an engineering perspective, the problem here is to provide maximum sensitivity (to find all there is to find), specificity (to only find real homology) and speed (to accommodate the billions of base pairs of vertebrate genomes).

RESULTS: Satsuma addresses all three issues through novel strategies: (i) cross-correlation, implemented via fast Fourier transform; (ii) a match scoring scheme that eliminates almost all false hits; and (iii) an asynchronous 'battleship'-like search that allows for aligning two entire fish genomes (470 and 217 Mb) in 120 CPU hours using 15 processors on a single machine.
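As an illustration of strategy (i), the sketch below shows one common way to cross-correlate two DNA sequences with a fast Fourier transform. It is not taken from Satsuma or Spines (which are written in C++); the function names `fft` and `match_counts`, the four-channel base encoding, and the toy sequences are all assumptions made for this example. Each base is turned into a 0/1 indicator signal, the signals are correlated through the frequency domain, and the four results are summed to give the number of matching bases at every relative shift.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def match_counts(query, target):
    """Return {shift: matches}, where shift s aligns query[i] with target[i + s]."""
    # Zero-pad to a power of two large enough to avoid circular wrap-around.
    size = 1
    while size < len(query) + len(target):
        size *= 2
    totals = [0.0] * size
    for base in "ACGT":
        # Hypothetical encoding: one 0/1 indicator channel per base.
        q = [1.0 if c == base else 0.0 for c in query] + [0.0] * (size - len(query))
        t = [1.0 if c == base else 0.0 for c in target] + [0.0] * (size - len(target))
        fq, ft = fft(q), fft(t)
        # Cross-correlation theorem: corr = IFFT(conj(FFT(q)) * FFT(t)).
        product = [a.conjugate() * b for a, b in zip(fq, ft)]
        corr = fft(product, inverse=True)
        for i in range(size):
            totals[i] += corr[i].real / size  # divide by size to normalize the inverse
    # Negative shifts are stored at the end of the circular result.
    return {s: round(totals[s % size])
            for s in range(-(len(query) - 1), len(target))}


counts = match_counts("ACGT", "TTACGTAA")
best_shift = max(counts, key=counts.get)
```

Here `best_shift` is the offset at which the two toy sequences share the most identical bases. The FFT route costs O(n log n) per channel instead of the O(n·m) of comparing every offset directly, which is the general reason cross-correlation is attractive for long sequences; how Satsuma windows, scores, and filters the resulting signal is not described in this abstract and is not modeled here.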

AVAILABILITY: Satsuma is part of the Spines software package, implemented in C++ on Linux. The latest version of Spines can be freely downloaded under the LGPL license from http://www.broadinstitute.org/science/programs/genome-biology/spines/.

URL: http://bioinformatics.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=20208069
DOI: 10.1093/bioinformatics/btq102
PubMed: http://www.ncbi.nlm.nih.gov/pubmed/20208069?dopt=Abstract

Alternate Journal: Bioinformatics
PubMed ID: 20208069
PubMed Central ID: PMC2859124
Grant List: U54 HG003067 / HG / NHGRI NIH HHS / United States
1 U54 HG03067 / HG / NHGRI NIH HHS / United States