

A GATKReport is simply a text document that contains a well-formatted, easy-to-read representation of tabular data. Many GATK tools output their results as GATKReports, so it is important to understand how they are formatted and how you can use them in further analyses.

Here's a simple example:

#:GATKReport.v1.0:2
#:GATKTable:true:2:9:%.18E:%.15f:;
#:GATKTable:ErrorRatePerCycle:The error rate per sequenced position in the reads
cycle  errorrate.61PA8.7         qualavg.61PA8.7                                         
0      7.451835696110506E-3      25.474613284804366                                      
1      2.362777171937477E-3      29.844949954504095                                      
2      9.087604507451836E-4      32.875909752547310
3      5.452562704471102E-4      34.498999090081895                                      
4      9.087604507451836E-4      35.148316651501370                                       
5      5.452562704471102E-4      36.072234352256190                                       
6      5.452562704471102E-4      36.121724890829700                                        
7      5.452562704471102E-4      36.191048034934500                                        
8      5.452562704471102E-4      36.003457059679770                                       

#:GATKTable:false:2:3:%s:%c:;
#:GATKTable:TableName:Description
key     column
1:1000  T
1:1001  A
1:1002  C

This report contains two individual GATK report tables. Every table begins with a header line for its metadata, followed by a header line for its name and description. The next row contains the column names, and the rows after that contain the data.
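
To make that layout concrete, here is a minimal sketch of a reader for it, written in Python rather than R purely for illustration. It assumes that columns are separated by whitespace and that no value contains a space. The names `parse_gatkreport` and `SAMPLE` are made up for this example and are not part of GATK or gsalib. The metadata line is kept as raw text because its individual fields are not described here.

```python
# Sample report text copied from the example above.
SAMPLE = """\
#:GATKReport.v1.0:2
#:GATKTable:true:2:9:%.18E:%.15f:;
#:GATKTable:ErrorRatePerCycle:The error rate per sequenced position in the reads
cycle  errorrate.61PA8.7         qualavg.61PA8.7
0      7.451835696110506E-3      25.474613284804366
1      2.362777171937477E-3      29.844949954504095
2      9.087604507451836E-4      32.875909752547310
3      5.452562704471102E-4      34.498999090081895
4      9.087604507451836E-4      35.148316651501370
5      5.452562704471102E-4      36.072234352256190
6      5.452562704471102E-4      36.121724890829700
7      5.452562704471102E-4      36.191048034934500
8      5.452562704471102E-4      36.003457059679770

#:GATKTable:false:2:3:%s:%c:;
#:GATKTable:TableName:Description
key    column
1:1000  T
1:1001  A
1:1002  C
"""


def parse_gatkreport(text):
    """Return a dict mapping each table name to its parsed contents."""
    lines = [line.rstrip() for line in text.splitlines()]
    if not lines or not lines[0].startswith("#:GATKReport."):
        raise ValueError("missing '#:GATKReport' header line")

    tables = {}
    i = 1
    while i < len(lines):
        if not lines[i].startswith("#:GATKTable:"):
            i += 1  # skip blank lines between tables
            continue
        if i + 2 >= len(lines):
            raise ValueError("truncated table header")
        metadata = lines[i]  # first header line, kept as raw text
        # Second header line: name and description (description may contain ':').
        _, _, name, description = lines[i + 1].split(":", 3)
        columns = lines[i + 2].split()
        i += 3
        rows = []
        # Data rows run until a blank line, the next header, or end of input.
        while i < len(lines) and lines[i] and not lines[i].startswith("#:"):
            # Assumption: values never contain whitespace; all stay as strings.
            rows.append(dict(zip(columns, lines[i].split())))
            i += 1
        tables[name] = {
            "metadata": metadata,
            "description": description,
            "columns": columns,
            "rows": rows,
        }
    return tables


tables = parse_gatkreport(SAMPLE)
for name, table in tables.items():
    print(name, table["columns"], len(table["rows"]))
```

Each table ends up as a list of rows keyed by column name, which is roughly the shape of the per-table data frames that gsalib produces in R.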

We provide an R library called gsalib that allows you to load GATKReport files into R for further analysis. Here are the five steps for getting gsalib, installing it, and loading a report.

1. Get the GATK source code from GitHub

Please visit the Downloads page for instructions.

2. Compile the gsalib library

$ ant gsalib
Buildfile: build.xml

gsalib:
     [exec] * installing *source* package ‘gsalib’ ...
     [exec] ** R
     [exec] ** data
     [exec] ** preparing package for lazy loading
     [exec] ** help
     [exec] *** installing help indices
     [exec] ** building package indices ...
     [exec] ** testing if installed package can be loaded
     [exec] 
     [exec] * DONE (gsalib)

BUILD SUCCESSFUL

3. Tell R where to find the gsalib library by adding the path in your ~/.Rprofile (you may need to create this file if it doesn't exist)

$ cat .Rprofile 
.libPaths("/path/to/Sting/R/")

4. Start R and load the gsalib library

$ R

R version 2.11.0 (2010-04-22)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(gsalib)

5. Finally, load the GATKReport file and have fun

> d = gsa.read.gatkreport("/path/to/my.gatkreport")
> summary(d)
              Length Class      Mode
CountVariants 27     data.frame list
CompOverlap   13     data.frame list

Hi,

Does anyone use the GATK R library (gsalib)? I followed the guide at this link (http://gatkforums.broadinstitute.org/discussion/1244/what-is-a-gatkreport), but when I typed "ant gsalib" on my Linux system with Java 1.6, the build failed. Here is the error message:

5:00pm qyu@vbronze /bit/data01/projects/prod_scripts/GATK/gatk-master $ ant gsalib
Buildfile: build.xml

BUILD FAILED
/bit/data01/projects/prod_scripts/GATK/gatk-master/build.xml:184: The type doesn't support the "erroronmissingdir" attribute.

Does anyone know what is wrong here?

Many thanks, Qing