Information about Haploview's input file formats can be found in the files section of the user manual. It might also be useful to look at the sample files in the downloads section.
How is my pedigree structure being used?
What does "Not a member of maximum unrelated subset" mean?
Family structures are parsed using an algorithm that extracts the maximum-utility, preferentially phased subset, defined as follows: find the maximum set of unrelated individuals, optimizing for the number of family trios (for obligate phasing) and the fraction of successfully genotyped markers. Individuals who have sufficient genotypes and unbroken pedigrees but who are not members of this maximum set of unrelated individuals usable for LD purposes are marked as "Not a member of maximum unrelated subset." We use this subset for all LD, haplotype frequency and haplotype association calculations. It consists of two chromosomes from each singleton and four chromosomes from each founder trio.
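As a quick illustration of the chromosome count described above, here is a minimal sketch of the arithmetic. The function name is hypothetical and is not part of Haploview; it only restates the rule of two chromosomes per singleton and four per founder trio.

```python
def chromosomes_in_ld_subset(num_singletons: int, num_founder_trios: int) -> int:
    """Count chromosomes in the LD subset (hypothetical helper, not Haploview code).

    Rule taken from the text: 2 chromosomes per singleton,
    4 chromosomes per founder trio.
    """
    return 2 * num_singletons + 4 * num_founder_trios


# Example: 30 singletons and 20 founder trios.
print(chromosomes_in_ld_subset(30, 20))  # 2*30 + 4*20 = 140
```

So a dataset reduced to 30 singletons and 20 founder trios would contribute 140 chromosomes to the LD and haplotype calculations.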
A different set is used for performing single marker association. For case/control analysis we use all singletons who have a non-zero affectation status. For TDT we use any transmission, that is, any parent/parent/affected-offspring trio with valid genotypes is tested. This means that multiple affected siblings and multiple generations with affected parents will be used.
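To make the TDT selection rule concrete, the sketch below lists every parent/parent/affected-offspring trio in a set of linkage-format lines. It is only an illustration, not Haploview's implementation. It assumes the usual linkage column order (family ID, individual ID, father ID, mother ID, sex, affection status, then allele pairs) and the usual affection coding where "2" means affected. It does not check genotype validity, which would have to be done per marker. The function name `tdt_trios` is hypothetical.

```python
def tdt_trios(ped_lines):
    """List (family, father, mother, child) for each affected child with both parents present.

    Assumes linkage-format columns and affection status "2" = affected.
    Genotype validity is NOT checked here.
    """
    rows = {}
    for line in ped_lines:
        fields = line.split()
        if fields:
            rows[(fields[0], fields[1])] = fields

    trios = []
    for (fam, ind), fields in rows.items():
        father, mother, status = fields[2], fields[3], fields[5]
        both_parents_present = (fam, father) in rows and (fam, mother) in rows
        if status == "2" and both_parents_present:
            trios.append((fam, father, mother, ind))
    return trios


ped = [
    "F1 dad 0 0 1 1 1 2",
    "F1 mom 0 0 2 1 1 1",
    "F1 kid1 dad mom 1 2 1 1",
    "F1 kid2 dad mom 2 2 1 2",
    "F1 kid3 dad mom 1 1 2 1",
]
for trio in tdt_trios(ped):
    print(trio)
```

In this example both affected siblings (kid1 and kid2) yield a trio with the same parents, which matches the statement that multiple affected siblings are used.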
Why are my individuals being filtered out?
Haploview removes individuals who have more than a certain fraction of missing genotypes. The default for this setting is 50%, but it can be adjusted when loading a file, or from the command line with the "-missingcutoff" option. This is especially important to keep in mind for datasets where some individuals have been genotyped on an incomplete subset of the total marker set.
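If you want to check in advance which individuals are likely to be dropped, a rough pre-check along the following lines may help. This is a sketch, not Haploview's own code. It assumes linkage-format lines with six leading columns followed by allele pairs, and it assumes a genotype counts as missing when either allele is "0". The function names are hypothetical.

```python
def missing_fraction(ped_line: str) -> float:
    """Fraction of genotypes on one linkage-format line that are missing.

    Assumption: columns 7 onward are allele pairs, and a genotype is
    missing if either allele is "0".
    """
    alleles = ped_line.split()[6:]
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    if not genotypes:
        return 0.0
    missing = sum(1 for a, b in genotypes if a == "0" or b == "0")
    return missing / len(genotypes)


def exceeds_cutoff(ped_line: str, cutoff: float = 0.5) -> bool:
    """True if the individual has more than `cutoff` missing genotypes."""
    return missing_fraction(ped_line) > cutoff


print(exceeds_cutoff("FAM1 IND1 0 0 1 2 1 2 0 0 3 3 0 0"))  # 2 of 4 missing
print(exceeds_cutoff("FAM1 IND2 0 0 2 1 0 0 0 0 0 0 4 4"))  # 3 of 4 missing
```

With the default cutoff of 0.5, the first individual (exactly half missing) is kept and the second is flagged, since the rule described above applies only to fractions greater than the cutoff.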
Individuals with only one parent (i.e. they have one valid parent ID and "0" for the other parent) are currently not supported. For this reason Haploview excludes families containing such individuals. We hope to eventually add more sensible handling of such situations, but for now you can force such individuals in by creating a fake placeholder individual to use as the missing parent and setting all of his or her genotypes to zero.
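The placeholder workaround can be scripted. The sketch below is one possible way to do it, under several assumptions: the file is in linkage format (family ID, individual ID, father ID, mother ID, sex, affection status, then allele pairs), sex is coded 1 for male and 2 for female, and the hypothetical "PH_" prefix does not collide with any real individual ID. Siblings that share the same known parent are given the same placeholder so that they remain full siblings.

```python
def add_placeholder_parents(ped_lines, prefix="PH_"):
    """Add all-zero placeholder parents for individuals with exactly one parent.

    Hypothetical helper, not part of Haploview. The placeholder ID is the
    prefix plus the known parent's ID, so siblings share one placeholder.
    """
    rows = [line.split() for line in ped_lines if line.strip()]
    placeholders = {}  # (family, placeholder ID) -> placeholder row

    for row in rows:
        fam, father, mother = row[0], row[2], row[3]
        n_alleles = len(row) - 6
        if (father == "0") == (mother == "0"):
            continue  # both parents known, or both missing: nothing to do
        if father == "0":
            new_id, sex, col = prefix + mother, "1", 2  # fake father
        else:
            new_id, sex, col = prefix + father, "2", 3  # fake mother
        row[col] = new_id
        # Unknown affection status (0) and all genotypes set to zero.
        placeholders.setdefault(
            (fam, new_id),
            [fam, new_id, "0", "0", sex, "0"] + ["0"] * n_alleles,
        )

    return [" ".join(r) for r in rows + list(placeholders.values())]


ped = [
    "F1 dad 0 0 1 1 1 2 3 3",
    "F1 kid1 dad 0 1 2 1 1 3 4",
    "F1 kid2 dad 0 2 1 2 2 3 3",
]
print("\n".join(add_placeholder_parents(ped)))
```

Here both children end up with mother "PH_dad", and a single extra line for that placeholder is appended with every genotype set to zero.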
How can I allow Haploview to use more memory for big datasets?
Haploview allocates 512 MB of memory by default. This is usually sufficient to handle datasets with several thousand markers. If you are running the program on very large datasets (>20,000 markers) you may need to allocate more memory (presuming your computer has sufficient resources available). This can be accomplished using the following command:
java -jar Haploview.jar -memory 2000
Here "2000" specifies 2000 megabytes (2 GB) of memory and can be adjusted as necessary. Older versions of Haploview required the following command, which still works:
Unfortunately, our current implementation of Haploview can't handle "marriage loops" in the pedigree files. If members of one of your families are connected by more than one marriage (e.g. two siblings from one family marry two siblings from another family) this will cause a problem. You can get your files to work by removing the offending individuals. We hope to have a fix for this issue soon.
Can I use Haploview with data for unrelated individuals?
Yes. Detailed information about constructing a linkage-style input file for studies with unrelated individuals can be found at the bottom of the discussion of the linkage format.