Tagged with #hadoop
0 documentation articles | 0 announcements | 4 forum discussions

No articles to display.


Created 2013-12-10 06:25:55 | Updated | Tags: hadoop

Comments (4)

Hi, I am trying to run the GATK tool on a Hadoop single-node cluster. I executed the command below:

hduser@ubuntu:~/apps/hadoop$ bin/hadoop jar GenomeAnalysisTK.jar -T RealignerTargetCreator -I /usr/hduser/gatkinput/exampleBAM.bam -R /usr/hduser/gatkinput/exampleFASTA.fasta -o output.list

After executing this command, I got an exception, which is in the attached file "gatk error on hadoop VM.txt". Please help me resolve this issue.

Created 2013-06-06 21:08:30 | Updated | Tags: commandlinegatk queue hadoop mapreduce google

Comments (8)

Hello, I'm new to GATK and Queue. I understand that we can write a QScript in Queue to generate separate GATK jobs and run them on a cluster of several nodes. Can we run GATK or Queue on Google Hadoop?

Created 2013-05-28 04:23:04 | Updated | Tags: hadoop mapreduce process improvement

Comments (1)

I am trying to change the I/O framework rather than the internal framework (the existing MapReduce and so on). I made the same kind of change to the I/O framework of another tool, and it worked. I think GATK's I/O framework can also be changed to add MapReduce. You said to rewrite the executive and the traversals. Do I need to rewrite only those frameworks? I think this project is related to gatk.io.*. (Of course, the I/O process extends across the whole framework.)

I have been analyzing the framework: CommandLineGATK -> CommandLineExecutable -> GenomeAnalysisEngine -> OutputTracker -> ArgumentSource || Storage, Stub, ...

Created 2013-05-15 08:23:58 | Updated | Tags: pipeline hadoop hadoop-bam

Comments (5)

Hello, I'm a developer in Korea. Recently I have been developing a bioinformatics pipeline using BWA, Samtools, Picard, and GATK, and I want to run these tools on Hadoop, because MapReduce (MR) should be more efficient in terms of speed and memory. As I understand it, GATK is built on MR. If so, have you tested GATK on MR? In theory, that would be more efficient than GATK alone.

Also, if GATK needs an indexed and sorted SAM file, do I just need to sort and index it using the Hadoop-BAM library?

Because I am a novice in bioinformatics, this issue is too complicated for me.

