Hello, I'm a developer in Korea. Recently I have been developing a bioinformatics pipeline using BWA, Samtools, Picard, and GATK, and I would now like to run this pipeline on Hadoop. My reasoning is that MapReduce (MR) should be more efficient in terms of speed and memory. As I understand it, GATK is built on the MapReduce model. If so, have you tested GATK on Hadoop MapReduce? In theory, that should be more efficient than running GATK on its own.
Also, if GATK needs a sorted and indexed SAM file, can I simply do the sorting and indexing with the Hadoop-BAM library?
Because I am a novice in bioinformatics, this issue is too complicated for me.
e-mail : firstname.lastname@example.org phone : +821027266808