Frequently Asked Questions

What is SNAP?

SNAP is a computer program and web-based service for rapidly retrieving linkage disequilibrium (LD) proxy SNPs for one or more query SNPs, based on empirical observations from the International HapMap Project and the 1000 Genomes Project. A series of optional filters lets users restrict results to specific combinations of genotyping platforms, to proxies above a specified pairwise r2 threshold, or to proxies within a maximum distance of the query SNP. Most questions regarding the implementation, associated terms and details of SNAP can be answered by referring to the Documentation page.
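As a rough illustration of how the three filters combine, here is a minimal Python sketch. It is not SNAP's implementation: the `ProxyResult` record, its field names, the default threshold of 0.8, the inclusive comparisons, and the "on at least one selected array" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyResult:
    """Hypothetical record for one query/proxy pair (field names are assumed)."""
    query: str
    proxy: str
    r2: float
    distance_bp: int
    arrays: frozenset  # names of arrays that carry the proxy SNP


def filter_proxies(results, min_r2=0.8, max_distance_bp=500_000, arrays=None):
    """Keep proxies that pass the r2, distance, and optional array filters."""
    wanted = frozenset(arrays) if arrays else None
    kept = []
    for rec in results:
        if rec.r2 < min_r2:
            continue  # below the pairwise r2 threshold
        if rec.distance_bp > max_distance_bp:
            continue  # farther from the query than the distance limit
        if wanted is not None and not (rec.arrays & wanted):
            continue  # assumed rule: proxy must be on at least one selected array
        kept.append(rec)
    return kept


# Made-up example data; the SNP and array names are placeholders.
example = [
    ProxyResult("rs_query", "rs_a", 0.95, 12_000, frozenset({"ArrayA"})),
    ProxyResult("rs_query", "rs_b", 0.40, 3_000, frozenset({"ArrayA"})),
    ProxyResult("rs_query", "rs_c", 0.90, 650_000, frozenset()),
    ProxyResult("rs_query", "rs_d", 0.85, 90_000, frozenset()),
]
```

With the defaults, `filter_proxies(example)` keeps `rs_a` and `rs_d`; passing `arrays=["ArrayA"]` narrows the result to `rs_a` alone.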

How can I run SNAP locally on my own computer?

A local option for SNAP is not currently planned or available.

Why is the maximum distance limit 500 kb?

Pairwise LD calculations in the HapMap were done only for SNP pairs within 500 kb of one another. Although long-range LD is certainly observed in some regions of the genome, in most cases the LD signal decays substantially within 500 kb.

When I run SNAP with release 21 and release 22 I get different results. Why is this?

The SNPs included in HapMap releases 21 and 22 are slightly different for a variety of reasons. In general we suggest relying on release 22 unless release 21 is specifically required. This is because release 22 uses a newer genome build (hg18) and includes more overlap with the commercially available genotyping arrays. Please note that Chromosome X proxies are not available in release 22. The distances between query and proxy SNPs may differ between release 21 and release 22 results because they rely on SNP coordinates from different builds of the human genome (hg17 and hg18, respectively). Due to differences in sequencing (e.g., gap fill-ins), the relative positions of SNPs as we understand them can shift across genome builds.

There is no filter available for the commercial genotyping array I want. Why not?

SNAP includes most but not all commercial genetic mapping arrays. Because new products are continually being designed and released, SNAP is implemented so that new array filters can easily be added in future releases of the tool. Once marker lists are finalized for newer products, we will add them in an updated version. If there are additional legacy arrays you would like to see included because of a specific need, feel free to contact us with the necessary details and we will consider adding them in a future release.

Why are some arrays listed twice, for example the Illumina Human1M?

Illumina arrays are produced in different configurations and these different configurations sometimes contain slightly different lists of SNPs. The Illumina Human1M, for example, comes in both a single sample and dual sample configuration. There are slightly different SNP lists on the two products.

I got Warning and Error messages in my output. What do they mean? Can I get rid of them?

If you would like to eliminate messages in your output, re-run your query after checking the box "Suppress warning messages in output."

I see different query or proxy SNPids than I expected in my results. Why is this?

As dbSNP grew, there were SNPs known by two or more IDs due to separate submissions of information pertaining to the same SNP. Thus, with subsequent dbSNP builds overlapping "alias IDs" have been merged into the current SNPids. SNAP incorporates knowledge of SNP alias IDs for query and proxy SNPs into the search strategy. Thus, it is possible that if there is an alias ID for your query SNP the HapMap SNPid that is returned will not be the same. The query column always contains the SNPid you used as input. The proxy column always contains the HapMap SNPid for the chosen HapMap release.
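The behavior described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the alias table, the LD table, the SNP IDs, and the function name `find_proxies` are all made up for the example and do not reflect SNAP's internal data or code.

```python
# Hypothetical alias table: an alias ID mapped to the SNPid used in HapMap.
ALIAS_TO_HAPMAP = {"rs_alias_1": "rs_hapmap_1"}

# Hypothetical LD table keyed by HapMap SNPid (made-up proxies).
LD_TABLE = {"rs_hapmap_1": ["rs_hapmap_1", "rs_hapmap_2"]}


def find_proxies(query_id):
    """Look up proxies via the HapMap ID, but report the ID the user typed."""
    hapmap_id = ALIAS_TO_HAPMAP.get(query_id, query_id)
    return [
        # The query column echoes the input; the proxy column holds HapMap IDs.
        {"query": query_id, "proxy": proxy_id}
        for proxy_id in LD_TABLE.get(hapmap_id, [])
    ]
```

In this sketch, searching for `rs_alias_1` returns rows whose query column still reads `rs_alias_1`, while the proxy column lists `rs_hapmap_1` and `rs_hapmap_2`, which is why the IDs in the two columns can differ from what you expected.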

I see there is no array information for some or all of the proxy SNPs in my output. Why is this?

The vast majority of SNPs on commercial arrays have been genotyped as part of the HapMap project. However, the HapMap project encompasses many more SNPs that do not currently appear on commercial arrays. If you only want to return SNP proxies on commercial arrays then select specific arrays, or click "Select All".

Why did you choose to include only an r2 threshold and not one for D' as well?

r2 is generally preferred to D' in association study mapping and shows less inflation than D' in smaller population surveys.
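To make the difference concrete, the following Python sketch computes both statistics from the four haplotype counts of a pair of biallelic SNPs, using the standard textbook definitions (D = pAB - pA*pB, D' = |D| / Dmax, r2 = D^2 / (pA(1-pA)pB(1-pB))). The function name and the example counts are made up for illustration, and this is not SNAP's own code.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (abs(D'), r2) from haplotype counts for two biallelic SNPs."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes observed")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second SNP
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic")
    d = p_AB - p_A * p_B
    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / denom
    return d_prime, r2


# Made-up sample of 100 chromosomes in which allele A was seen only once.
d_prime, r2 = ld_stats(n_AB=1, n_Ab=0, n_aB=49, n_ab=50)
```

In this made-up sample a single observed copy of the rare allele is enough to push D' to 1, while r2 stays at about 0.01, which is the kind of small-sample inflation the answer above refers to.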

What is the color scheme used on the SNAP LD plots?

Each panel is identified by a different color scheme: CEU is red, YRI is green and JPT+CHB is blue. The color saturation of each point is proportional to the r2 value for that SNP, so points with r2 near 0 appear nearly white and points with r2 near 1 appear in the full panel color. The color is determined by the R hsv function:
 color = hsv(baseColor, r2, 1.0)
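For readers who want to reproduce a similar palette outside R, here is a minimal Python sketch using the standard-library `colorsys` module, whose `hsv_to_rgb(h, s, v)` takes the same hue, saturation and value arguments on a 0 to 1 scale. The base hue values (0, 1/3 and 2/3 for red, green and blue) and the function name `ld_point_color` are assumptions; the actual `baseColor` values used by SNAP are not stated here.

```python
import colorsys

# Assumed base hues on a 0-1 scale: red, green, blue.
PANEL_HUES = {"CEU": 0.0, "YRI": 1 / 3, "JPT+CHB": 2 / 3}


def ld_point_color(panel, r2):
    """Return a hex color for a point: hue from the panel, saturation from r2."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("r2 must be between 0 and 1")
    # Mirrors hsv(baseColor, r2, 1.0): value is fixed at 1, saturation is r2.
    red, green, blue = colorsys.hsv_to_rgb(PANEL_HUES[panel], r2, 1.0)
    return "#{:02X}{:02X}{:02X}".format(
        round(red * 255), round(green * 255), round(blue * 255)
    )
```

Under these assumptions an r2 of 0 maps to white in every panel, and an r2 of 1 maps to the pure panel color.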

What is the color scheme used on SNAP association plots?

SNAP association plots use a combination of shape and color to provide information about each SNP. Genotyped SNPs and imputed SNPs are shown as diamonds. Other kinds of SNPs (including those of unknown type) are shown as squares. The color is determined by the R hsv function:
 color = hsv(0, r2, 1.0)

How are sex chromosomes handled?

The LD data used by SNAP are based on phased genotype data from HapMap and the 1000 Genomes Project. Phased genotypes are available for chrX for HapMap Release 21 and 1000 Genomes Pilot 1, but not for Release 22 or the HapMap3_r2 release. As a result, SNAP will not find proxy SNPs on chrX or include chrX SNPs on plots when using Release 22 or HapMap3_r2. SNAP does not have LD data for chrY for any data set.

How and when should I cite SNAP?

If you use SNAP in the design, analysis or implementation stages of any project, we would appreciate it if you cite it. [Citation information]