The DBSNP rod
Now that dbSNP is distributed in VCF format, please use the VCF file instead of the old UCSC-style ROD file.
dbSNP rods against b36, hg18, and b37
These are included in the GATK resource bundle.
The old way of creating dbSNP rods: NO LONGER USED
The hg18 dbSNP Reference Ordered Data (ROD) file is a text dump of the dbSNP database:
http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/snp129.txt.gz
Which is in the following format:
We use a script (see below) to clean up this file by removing the _random and _hap contigs, and then sort the result with the sortByRef.pl script.
# make the hg18 sorted dbSNP record
# note that we assume header/comment/annotation lines (starting with '#') have been removed from the file already
cat /seq/references/dbsnp/downloads/snp129_hg18.txt | awk '($2 !~ "_random") && ($2 !~ "_hap")' > tmp.dbsnp.txt
~/dev/GenomeAnalysisTK/trunk/perl/sortByRef.pl --k 2 tmp.dbsnp.txt /seq/references/Homo_sapiens_assembly18/v0/Homo_sapiens_assembly18.fasta.fai > dbsnp_129_hg18.rod
rm tmp.dbsnp.txt
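Before sorting, it can help to confirm that every contig left in the filtered file also appears in the reference index, since sortByRef.pl is given the .fai to define contig order. The following is a minimal sketch of such a check, not part of the original procedure. It assumes the contig name is in column 2 of the dbSNP dump and column 1 of the .fai; the file names and rows are made-up toy stand-ins for the real files.

```shell
# Toy stand-ins for the filtered dbSNP dump and the reference .fai (rows are made up).
printf '585\tchr1\t100\t101\trs0000001\n585\tchr2\t200\t201\trs0000002\n585\tchrUn\t300\t301\trs0000003\n' > toy.filtered.txt
printf 'chr1\t1000\t6\t60\t61\nchr2\t900\t1030\t60\t61\n' > toy.fai

# First pass (NR == FNR) loads the contig names from the .fai; the second pass
# prints each dbSNP contig that the index does not list, once per contig.
awk -F'\t' 'NR == FNR { known[$1] = 1; next }
            !($2 in known) && !seen[$2]++ { print $2 }' toy.fai toy.filtered.txt > toy.missing.txt

cat toy.missing.txt   # contigs absent from the reference index
```

With the toy input, only chrUn is reported. An empty toy.missing.txt would mean every contig in the dump is known to the index.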
# make the b36 (1KG reference) sorted dbSNP record
# We derived our b36 dbSNP file from the hg18 dump (snp129) using a script that renames chrN to N (e.g. chr1 to 1) and chrM to MT, and
# removes the *_random and *_hap contigs, since we weren't sure how to map the _random and _hap contigs onto the b36 build
# NT_* and NC_* contigs. Finally we sort the file into the b36 order.
cat /seq/references/dbsnp/downloads/snp129_hg18.txt | awk '$2 !~ "_random" {print}' | awk '$2 !~ "_hap" {print}' | awk '{sub(/chrM/, "chrMT", $0); sub(/chr/, "", $0); print}' > tmp.dbsnp.txt
~/dev/GenomeAnalysisTK/trunk/perl/sortByRef.pl --k 2 tmp.dbsnp.txt /broad/1KG/reference/human_b36_both.fasta.fai > dbsnp_129_b36.rod
rm tmp.dbsnp.txt
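To see what the b36 filter and rename step does to individual records, here is a minimal sketch run on a four-line toy input. The rows, rs IDs and file names (toy.dbsnp.txt, toy.b36.txt) are made up for illustration; only the contig column (field 2) matters. The three awk calls of the command above are folded into one call with the same logic.

```shell
# Toy input: tab-separated, contig name in column 2 (rows are made up).
printf '585\tchr1\t100\t101\trs0000001\n'          >  toy.dbsnp.txt
printf '585\tchrM\t200\t201\trs0000002\n'          >> toy.dbsnp.txt
printf '585\tchr1_random\t300\t301\trs0000003\n'   >> toy.dbsnp.txt
printf '585\tchr6_cox_hap1\t400\t401\trs0000004\n' >> toy.dbsnp.txt

# Drop _random and _hap contigs, then rename on the whole line ($0) so the
# original tab separators are preserved.
awk '($2 !~ /_random/) && ($2 !~ /_hap/) {
       sub(/chrM/, "chrMT", $0)   # chrM -> chrMT
       sub(/chr/, "", $0)         # strip the first "chr": chr1 -> 1, chrMT -> MT
       print
     }' toy.dbsnp.txt > toy.b36.txt

cut -f 2 toy.b36.txt   # contig names after filtering and renaming
```

The two surviving records end up on contigs 1 and MT. Because sub() replaces only the first match in the line, the rename relies on the contig column being the first place where "chr" occurs.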