Unable to strand-check genotype likelihoods file, perhaps diminishing Beagle functionality
Posted in Ask the GATK team | Last updated on 2013-01-28 06:18:43



Dear GATK team and community members,

I used ProduceBeagleInput to create a genotype likelihoods file and ran beagle.jar following the example in http://gatkforums.broadinstitute.org/discussion/43/interface-with-beagle-software. Beagle warned that a reference panel is recommended for genotype imputation and phasing, so I downloaded the recommended reference panel (http://bochet.gcc.biostat.washington.edu/beagle/1000_Genomes.phase1_release_v3/). However, Beagle requires the alleles to be in the same order in both the reference and sample files. The tool for checking this is check_strands.py (http://faculty.washington.edu/sguy/beagle/strand_switching/README), but it requires both the sample and reference files to be in .bgl format, and what ProduceBeagleInput gives me is a genotype likelihoods file rather than a .bgl file. This is a little disappointing: without the reference panel, Beagle's calculations won't be as accurate, although I'm not sure by how much.
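
To make the allele-order requirement concrete, here is a minimal Python sketch of the kind of per-marker comparison involved. It is not check_strands.py and not anything GATK provides; the function names and category labels are hypothetical, and it assumes the two alleles of each biallelic SNP have already been read from the sample and reference files into plain tuples keyed by marker ID.

```python
# Hypothetical illustration only: compare the allele pair of each shared marker
# in a sample file against the allele pair in a reference panel.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def classify_marker(sample_alleles, ref_alleles):
    """Classify how a sample allele pair relates to the reference allele pair.

    Returns one of: 'same', 'swapped', 'flipped', 'flipped_swapped',
    'ambiguous', 'mismatch'.
    """
    s = tuple(a.upper() for a in sample_alleles)
    r = tuple(a.upper() for a in ref_alleles)

    # Only biallelic single-base markers are handled in this sketch.
    if len(s) != 2 or len(r) != 2:
        return "mismatch"
    if any(a not in COMPLEMENT for a in s + r):
        return "mismatch"

    # A/T and C/G SNPs read the same on both strands, so strand cannot be
    # inferred from the alleles alone.
    if COMPLEMENT[s[0]] == s[1]:
        return "ambiguous"

    flipped = tuple(COMPLEMENT[a] for a in s)
    if s == r:
        return "same"             # same strand, same allele order
    if s == r[::-1]:
        return "swapped"          # same strand, allele order reversed
    if flipped == r:
        return "flipped"          # opposite strand, same allele order
    if flipped == r[::-1]:
        return "flipped_swapped"  # opposite strand, allele order reversed
    return "mismatch"


def classify_shared_markers(sample, reference):
    """Classify every marker ID present in both dicts (marker -> allele pair)."""
    shared = sample.keys() & reference.keys()
    return {m: classify_marker(sample[m], reference[m]) for m in sorted(shared)}


if __name__ == "__main__":
    # Made-up marker IDs and alleles, purely for demonstration.
    sample = {"20:1000": ("A", "G"), "20:2000": ("C", "T"), "20:3000": ("A", "T")}
    reference = {"20:1000": ("G", "A"), "20:2000": ("G", "A"), "20:3000": ("A", "T")}
    for marker, status in classify_shared_markers(sample, reference).items():
        print(marker, status)
```

Anything other than "same" would need attention before running Beagle with the panel, and "ambiguous" markers cannot be resolved from the alleles alone.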

I understand that this may be outside the GATK team's scope of responsibility, but I would greatly appreciate any suggestions for phasing GATK's Beagle input using a reference panel. Alternatively, could the GATK team write a tool that produces .bgl files?

Regards, Jamie

