Tagged with #customwalker
0 documentation articles | 0 announcements | 3 forum discussions

No posts found with the requested search criteria.
Comments (4)

Hi fellow htsjdk/picard/gatk developers!

I've been thinking about this for quite some time now, so I thought I should write up a quick post about it here.

I've been writing custom tools for our group using both picard and GATK for some time now. It's been working nicely, but I have been missing a set of basic tutorials and examples that would let users quickly get started writing walkers. My most commonly used reference has been the 20-line life savers (http://www.slideshare.net/danbolser/20line-lifesavers-coding-simple-solutions-in-the-gatk), which is getting a bit dated.

What I would like to see is something like the following:

  • What's in htsjdk? What's not in htsjdk? (from a dev's perspective - in terms of frameworks)
  • What's in picard? What's not in picard? (from a dev's perspective - in terms of frameworks)
  • What's in gatk? What's not in gatk? (from a dev's perspective - in terms of frameworks)
  • When to use htsjdk, picard, and GATK. What are the strengths and weaknesses of the three? (possibly more that I've missed)
  • Your first htsjdk walker
  • Your first picard walker
  • Your first gatk walker
  • Traversing a BAM in htsjdk vs gatk - what are the differences

There might be more stuff that could go in here as well. The driving force behind this is that I'm a bit confused myself by the overlap of these three packages/frameworks. I do understand that picard uses htsjdk, and that GATK uses both as dependencies, but it's not super clear what extra functionality (for a developer) is added at each step from htsjdk -> picard -> gatk.

Could we assemble a small group of interested developers to contribute to this? We could set up a git repo with the examples and tutorials for easy collaboration and sharing online.

Anyone interested? I'll count myself as the first member :)

Comments (2)

I was wondering how other people package and manage any custom walkers they write. I wrote my first walker last week, and am now trying to figure out the best way to make it available to my team and keep it up to date with new versions of GATK. I do my development on a laptop and analysis on a cluster, so I have to sync the code somehow. I can think of two possibilities, but it's unclear to me which is best:

Package a Monolithic jar
Pros: Makes distribution very easy, guarantees that there are never version incompatibilities
Cons: Well, I haven't been able to do it yet. I can't quite figure out how to supply a package .xml file that will import my classes into GenomeAnalysisTK.jar (or Queue.jar, for that matter). Also, I think the version number will change when I customize it, and customers/collaborators may get confused about a non-standard version number (if they notice)

Create a "Custom Walker" jar
Pros: I know how to do it. It won't change the GATK version, so the log files won't look different
Cons: I'll have to remember to include my jar on the classpath (or set an environment variable, I suppose). It's possible to encounter version incompatibilities, for instance if I want to test 3.0 but use 2.8 for production
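
For concreteness, the custom-jar option comes down to something like the sketch below. This is only an illustration under assumptions: the jar names, the walker name, and the input files are hypothetical, and the main class is assumed to be the entry point of the org.broadinstitute.sting-era GATK (it may differ in other releases). The script only prints the command (a dry run), so it can be checked without GATK installed.

```shell
#!/bin/sh
# Hypothetical file names; adjust to your own layout.
GATK_JAR="GenomeAnalysisTK.jar"   # stock, unmodified GATK jar
WALKER_JAR="MyWalkers.jar"        # jar holding only the custom walker classes

# Both jars go on the classpath; ':' is the separator on Linux/macOS.
CP="${GATK_JAR}:${WALKER_JAR}"

# Assumed GATK entry point for the "sting" package namespace; verify for your version.
MAIN_CLASS="org.broadinstitute.sting.gatk.CommandLineGATK"

# Hypothetical walker name as it would be passed to -T.
WALKER_NAME="MyWalker"

# Assemble the command; -T, -R and -I are the usual GATK walker/reference/input flags.
CMD="java -cp ${CP} ${MAIN_CLASS} -T ${WALKER_NAME} -R reference.fasta -I input.bam"

# Dry run: print the command instead of executing it.
echo "${CMD}"
```

Swapping in a different GATK_JAR is then the only change needed to test a new GATK version against the same walker jar, which is also where a version incompatibility would show up.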

Now that I think it through, maybe the cons of the custom jar aren't as bad as I thought. Does anyone else have any experience with Custom Walker maintenance/deployment? How did you do it?

Comments (6)


I have developed a custom walker that I think could be useful to the community. Therefore, I'd like to distribute it.

I tried following the brief guide "Redistributing the GATK-Lite or distributing walkers", but the build fails:

/Users/dankle/Dropbox/IdeaProjects/gatk/build.xml:955: no resources specified

The command I run is ant clean && ant package -Dexecutable=MyWalker.jar, and the XML file in packages is:

<package name="MyWalker">
  <version file="StingText.properties" property="org.broadinstitute.sting.gatk.version" />
  <executable name="MyWalker">
    <main-class name="org.broadinstitute.sting.gatk.walkers.dk.MyWalker" />
    <resource-bundle file="StingText.properties" />
      <module file="GATKEngine.xml"/>
    <executable directory="/humgen/gsa-hpprojects/GATK/bin" symlink="current" />
    <archive directory="/humgen/gsa-hpprojects/GATK/bin" symlink="GenomeAnalysisTK-latest.tar.bz2" />

What am I doing wrong?