It’s an exciting week here at GATK HQ because we finally get to tell you about this very cool project we’ve been working on: today, Broad and Google announced plans to make GATK available as a service on Google Cloud Platform as part of Google Genomics!
Our goal with this project is to enable researchers around the world to run the latest version of the GATK Best Practices pipeline without the hassle of figuring out the plumbing or storing huge amounts of genomic data locally. The initial alpha rollout will be limited to a small set of users. Based on our experience supporting the GATK user community, we think this will dramatically lower the barriers to top-notch genomic analysis capabilities, especially for researchers who don’t have access to the kind of dedicated compute infrastructure and engineering teams required for analyzing genomic data at scale.
Rest assured, if you’re the do-it-yourselfer type, the DIY GATK option is still here; you’ll still be able to download the latest version of GATK from the Broad website, set it up on your own system and run it locally. We will continue documenting and supporting this option with the same level of dedication as ever.
You can read more about the details in the Broad Institute press release and on the Google Cloud Platform blog. Feel free to ask questions in the comments thread of course. We’ll keep you posted here as we move forward.
Hello, I`m new to GATK and Queue. I understand that we can write a QScript in Queue to generate separate GATK jobs and run them on a cluster of several nodes. Can we implement GATK or Queue on google hadoop?