Inbred Laboratory Mouse Haplotype Map

Last update: February 2009
 
We are pleased to release updated genotype data from the inbred mouse haplotype map project. The project is led by the Daly Laboratory at the Broad Institute of Harvard and MIT and Massachusetts General Hospital, in collaboration with The Jackson Laboratory, with funding from the National Human Genome Research Institute.

Complete descriptions of the data and analyses can be found in (manuscript reference to be added).

Downloadable files

Genotypes

The file below is a gzipped, tab-delimited text file containing the post-QC Affymetrix dataset merged, for jointly typed strains, with the Wellcome-CTC Mouse Strain SNP Genotype Set (the large fraction of it that we could place reliably on the NCBI-build-37 assembly).  Each SNP appears on a single line, and the columns in the file are:
        1) SNP name/location according to the NCBI-build-37 mouse assembly (mm37-chr-position)
        2) Wellcome Trust marker name, if any
        3+) genotype data for each strain in the form of "genotype:affy-confidence-score" (0 is the best confidence score)
All hapmap/WT-discordant genotypes were removed (WT = Wellcome Trust).
Only homozygous WT calls were considered for well-placed build-37 SNPs.
Only high-quality (uppercase) genotypes from WT's "extra" typing were used.
Confidence-score special cases:
        * (asterisk after the Affymetrix confidence score) .. genotype confirmed by the WT genotype
        -1 .. a discordancy-based NoCall
        -2 .. a WT call which has no Affy confidence score
        -3 .. a NoCall for a WT-only marker
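
The cell format and special cases above can be sketched in Python as follows. This is only an illustration: the names `GenotypeCall` and `parse_cell` are hypothetical, and the example cell strings (such as "A:0.012*") are assumed from the description rather than copied from the file, so check them against the actual data before relying on this.

```python
from typing import NamedTuple, Optional

# Special confidence codes, as listed in the description above.
SPECIAL_CODES = {
    -1: "discordancy-based NoCall",
    -2: "WT call without an Affy confidence score",
    -3: "NoCall for a WT-only marker",
}


class GenotypeCall(NamedTuple):
    genotype: str
    confidence: Optional[float]  # None when the score is a special code
    wt_confirmed: bool           # True when the score carries a trailing "*"
    note: Optional[str]          # meaning of the special code, if any


def parse_cell(cell: str) -> GenotypeCall:
    """Split one "genotype:affy-confidence-score" cell (assumed layout)."""
    genotype, sep, score_text = cell.strip().rpartition(":")
    if not sep:
        raise ValueError(f"no ':' separator in cell: {cell!r}")
    wt_confirmed = score_text.endswith("*")
    if wt_confirmed:
        score_text = score_text[:-1]
    score = float(score_text)
    if score in SPECIAL_CODES:
        return GenotypeCall(genotype, None, wt_confirmed, SPECIAL_CODES[score])
    return GenotypeCall(genotype, score, wt_confirmed, None)
```

For example, `parse_cell("G:-2")` would report a WT call with no Affy confidence score, while `parse_cell("A:0.012*")` would report a WT-confirmed call with confidence 0.012.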

Merged_Hapmap_WT_Genotypes

List of typed strains

Annotated list of well-behaved SNPs on NspI and StyI chips

The file below is a gzipped, tab-delimited text file listing the well-behaved SNPs on the two Affymetrix mouse arrays.  Each SNP has one line for each array on which it is well behaved.  Columns in the file are:
        1) SNP name on Affy chip, which is the NCBI-build-33 assembly location (e.g. mm33-1-123456)
        2) enzyme - N = NspI, S = StyI
        3) NCBI-build-37 assembly location
        4) A-to-build-37-F-strand - the build-37-assembly, forward-strand-oriented base which corresponds to the Affy "A" allele
        5) B-to-build-37-F-strand - the build-37-assembly, forward-strand-oriented base which corresponds to the Affy "B" allele
        6) C57BL/6J-build-37-F-strand call
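
A minimal sketch of reading this six-column layout in Python is shown here. It assumes the file has no header line and that column 3 uses the same "mm37-chr-position" form as the SNP names; the names `GoodSnp`, `Location`, `parse_line`, `parse_location` and `read_good_snps` are hypothetical helpers, not part of the release.

```python
import gzip
import re
from typing import Iterator, NamedTuple

ENZYMES = {"N": "NspI", "S": "StyI"}

# Assumed name layout: "mm<build>-<chromosome>-<position>", e.g. mm33-1-123456.
LOCATION = re.compile(r"^mm(\d+)-([^-]+)-(\d+)$")


class Location(NamedTuple):
    build: int
    chromosome: str
    position: int


class GoodSnp(NamedTuple):
    affy_name: str         # column 1, build-33 based name
    enzyme: str            # column 2, expanded to NspI or StyI
    build37_location: str  # column 3
    allele_a: str          # column 4, build-37 forward-strand base for Affy "A"
    allele_b: str          # column 5, build-37 forward-strand base for Affy "B"
    b6_call: str           # column 6, C57BL/6J call


def parse_location(name: str) -> Location:
    match = LOCATION.match(name)
    if match is None:
        raise ValueError(f"unrecognised SNP location: {name!r}")
    build, chromosome, position = match.groups()
    return Location(int(build), chromosome, int(position))


def parse_line(line: str) -> GoodSnp:
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    name, enzyme_code, loc37, a, b, b6 = fields[:6]
    return GoodSnp(name, ENZYMES[enzyme_code], loc37, a, b, b6)


def read_good_snps(path: str) -> Iterator[GoodSnp]:
    """Yield one record per non-empty line of the gzipped file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

Because a SNP can appear once per array, grouping the records by `affy_name` would show which SNPs are well behaved on both the NspI and StyI chips.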

Good_SNPEnzyme_List

Flanking sequence and NCBI build 33 -> NCBI build 37 SNP mappings

The Affymetrix mouse chips were designed using the 2004 mouse assembly (NCBI build 33), and SNPs on the chips are named according to their build-33 location (mm33-chr-base_position).  The downloadable file below gives the mapping of each SNP to the NCBI-build-37 assembly, along with flanking sequence based on the build-37 assembly.

The file is a gzipped, tab-delimited text file, with one SNP per row and the following columns:
        1) SNP name on Affy chip, which is the 2004-assembly location (e.g. mm33-1-123456)
        2) B6(assembly)-allele relative to +-strand of 2004 assembly
        3) alternate-allele relative to +-strand of 2004 assembly
        All the following will be "N/A" if the SNP doesn't map well to NCBI build 37:
            4) NCBI-build-37 assembly location (e.g. mm37-1-123456)
            5) B6(assembly)-allele relative to +-strand of NCBI build-37 assembly
            6) alternate-allele relative to +-strand of NCBI build-37 assembly
            7) NCBI build-37 discovery has different alleles relative to that of NCBI build-33? .. Y/N
            8) NCBI build-37 discovery is reverse-complemented relative to that of NCBI build-33? .. Y/N ("N/A" if previous column is "Y")
            9) local sequence change relative to NCBI build-33 within 16 bases of the SNP (Affy probe footprint)? .. Y/N
            10) passes QC and NCBI-build-37-well-mapped/behaved for at least one enzyme chip (i.e. THE GOOD ONES) .. Y/N
        11) number of called strains, relative to original 94-strain typing
        12) average Affy DM confidence ("N/A" if previous column is 0)
        13) flanking sequence (bracketed-SNP notation with B6 allele as the numerator, 500 flanking bases to each side, sequence is soft repeatmasked, nearby known SNPs are N'd out)
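
The flanking-sequence column can be taken apart with a short Python sketch like the one below. It assumes the bracketed notation looks like "acGT[A/G]TNgc", with the B6 allele first inside the brackets, lowercase letters marking soft-masked repeats and "N" marking nearby known SNPs; the names `Flank`, `parse_flank` and `masked_fraction` are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed layout: <left flank>[<B6 allele>/<alternate allele>]<right flank>
BRACKETED = re.compile(r"^([A-Za-z]*)\[([A-Za-z-]+)/([A-Za-z-]+)\]([A-Za-z]*)$")


class Flank(NamedTuple):
    left: str
    b6_allele: str
    alt_allele: str
    right: str


def parse_flank(sequence: str) -> Flank:
    match = BRACKETED.match(sequence.strip())
    if match is None:
        raise ValueError("sequence is not in bracketed-SNP notation")
    return Flank(*match.groups())


def masked_fraction(flank: Flank) -> float:
    """Fraction of flanking bases that are soft-masked (lowercase) or N."""
    flanking = flank.left + flank.right
    if not flanking:
        return 0.0
    masked = sum(1 for base in flanking if base.islower() or base == "N")
    return masked / len(flanking)
```

A check such as `masked_fraction(parse_flank(seq)) < 0.5` is one possible way to skip SNPs whose surroundings are mostly repeats or known variants when designing assays; the threshold is purely illustrative.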

Flanks_And_Mappings

Useful links

Previous data release
High-density resequencing genotypes and imputation of non-resequenced strains
High-density resequencing data generation

Acknowledgements

Broad Institute of Harvard and MIT
University of California, Los Angeles
Harvard Medical School
Massachusetts General Hospital
The Jackson Laboratory (Mouse Phenome Project)
Affymetrix
Perlegen Sciences
NHGRI
NIEHS
Wellcome Trust Centre for Human Genetics