AuDIT Documentation, v6

Description: Automated Detection of Inaccurate and Imprecise Transitions in MRM Mass Spectrometry

Author: D. R. Mani, The Broad Institute

Contact: gp-help@broadinstitute.org

Summary

Multiple reaction monitoring-mass spectrometry (MRM-MS) of peptides with stable isotope-labeled internal standards (SIS) is a quantitative assay for measuring proteins in complex biological matrices. These assays can be highly precise and quantitative, but the frequent occurrence of interferences requires that MRM-MS data be manually reviewed by an expert, a time-intensive process that is subject to human error. The AuDIT module implements an algorithm that automatically identifies inaccurate transition data based on the presence of interfering signal or inconsistent recovery between replicate samples.

The algorithm for Automated Detection of Inaccurate and imprecise Transitions (AuDIT) in SID-MRM-MS data greatly reduces the time required for manual, subjective inspection of data, improves the overall accuracy of data analysis, and is easily implemented into the standard data analysis workflow. AuDIT currently works with exported results from MRM-MS data processing software packages.

Algorithm

The algorithm objectively evaluates MRM-MS data using two orthogonal approaches. First, it compares the relative product ion intensities of the analyte peptide to those of the SIS peptide and uses a t-test (in conjunction with the pvalue threshold parameter) to determine whether they are significantly different. Second, it calculates the coefficient of variation of the ratio of analyte to SIS peak areas across sample replicates, and flags transitions with excessive variation as unsuitable.
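The two checks described above can be illustrated with a minimal Python sketch. It is not the module's R implementation. The function names, the example peak areas, the use of Welch's variant of the t-test, and the expression of the CV as a fraction are all assumptions made for illustration. Only the t statistic is computed here, since converting it to a p-value needs a t-distribution that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev, variance


def relative_ratios(areas_a, areas_b):
    """Per-replicate ratio of one transition's peak area to another's."""
    return [a / b for a, b in zip(areas_a, areas_b)]


def welch_t_statistic(x, y):
    """Welch's t statistic for two samples (the test variant is an assumption)."""
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    if standard_error == 0:
        raise ValueError("both samples have zero variance")
    return (mean(x) - mean(y)) / standard_error


def ratio_cv(areas, is_areas):
    """CV (as a fraction) of the analyte/SIS area ratio across replicates."""
    ratios = [a / s for a, s in zip(areas, is_areas)]
    return stdev(ratios) / mean(ratios)


# Hypothetical peak areas for two transitions of one peptide, three replicates.
analyte_y5 = [100.0, 110.0, 90.0]
analyte_y6 = [50.0, 52.0, 47.0]
sis_y5 = [200.0, 210.0, 190.0]
sis_y6 = [100.0, 104.0, 96.0]

# Check 1: do the analyte's relative ratios differ from the SIS relative ratios?
t_stat = welch_t_statistic(relative_ratios(analyte_y5, analyte_y6),
                           relative_ratios(sis_y5, sis_y6))

# Check 2: how variable is the analyte/SIS ratio for transition y5?
cv_y5 = ratio_cv(analyte_y5, sis_y5)
```

A t statistic far from zero suggests that the analyte and SIS fragment patterns disagree, which is the signature of an interference. A large CV indicates imprecise recovery between replicates.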

References

Susan E. Abbatiello, D. R. Mani, Hasmik Keshishian, Steven A. Carr. Automated Detection of Inaccurate and Imprecise Transitions in Quantitative Assays of Peptides by Multiple Reaction Monitoring Mass Spectrometry. Clinical Chemistry 56:2 (2010).

Parameters

Name Description
data file * input file with pre-processed MRM data in csv format
skyline export * if "yes", data file is a Skyline csv export that includes the following columns: Sample (usually derived from SampleName, with an identical value for all replicates of a sample), PeptideSequence, ReplicateName, FragmentIon, PrecursorCharge, ProductCharge, light Area, heavy Area. If "no", data file must contain the columns sample, replicate, peptide, transition.id, area, IS.area, in that order.
pvalue threshold * p-value threshold above which a transition is quantification-worthy. Must be a value between 0 and 1 inclusive.
cv threshold * threshold for the coefficient of variation below which a transition is quantification-worthy
all pairs * calculate all possible pairs of relative ratios for a transition
output intermediate results * create files with intermediate results
output prefix * file name prefix used for output file(s)

* - required

Input Files

The input file specified by data file is a comma-delimited (csv) file that is derived from software used to pre-process raw MRM-MS data. The pre-processing software could be vendor-provided (e.g., MultiQuant from Applied Biosystems) or universal (e.g., Skyline from the MacCoss Lab, University of Washington). When the “skyline export” parameter is set to “no”, AuDIT is agnostic to the pre-processing software used, as long as the following columns are present in the input file, in the specified order: 
  1. Sample: The actual sample ID (excluding replicate notation). This is usually derived from the SampleName column output by MRM processing software. Sample must be unique for different concentrations (if any), and must be the same for all replicates of that sample. In other words, for a given peptide and transition, the value in the Sample column must be identical for all the replicates. 
  2. Replicate: Replicate number for the Sample. 
  3. Peptide: Peptide name and/or sequence for the peptide that is being monitored. 
  4. Transition.ID: An indication of the transition being monitored. This may be a number or some other notation (e.g., b- or y-fragment number with charge state). While different peptides may have the same transition.id, these must be unique for a given peptide. 
  5. Area: Integrated peak area for the analyte for the specified peptide and transition. 
  6. IS.Area: Integrated peak area for the SIS for the specified peptide and transition. 
The input file must contain a column header and the names used in the column header must appear as listed above. 
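Before submitting a generic (non-Skyline) file, it can help to check the header row against the six columns listed above. The following Python sketch does this with the standard csv module. The function name is hypothetical, and two choices are assumptions rather than documented behavior: names are compared case-insensitively, and the six columns are expected to be the first six in the file.

```python
import csv
import io

# Column names as listed above, in the required order.
REQUIRED_COLUMNS = ["Sample", "Replicate", "Peptide",
                    "Transition.ID", "Area", "IS.Area"]


def check_generic_header(csv_text):
    """Return a list of header problems; an empty list means the header looks fine."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    # Assumption: compare names case-insensitively and ignore surrounding spaces.
    found = [name.strip().lower() for name in header]
    problems = []
    for position, expected in enumerate(REQUIRED_COLUMNS):
        actual = found[position] if position < len(found) else None
        if actual != expected.lower():
            problems.append(
                f"column {position + 1}: expected {expected!r}, found {actual!r}")
    return problems


example = "Sample,Replicate,Peptide,Transition.ID,Area,IS.Area\nS1,1,PEP,y5,100,200\n"
print(check_generic_header(example))  # an empty list when the header matches
```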
When skyline export is set to “yes”, AuDIT assumes that the column names are Skyline-specific. The following columns must be present: PeptideSequence, ReplicateName, FragmentIon, PrecursorCharge, ProductCharge, light Area, heavy Area. In addition, a Sample column must be derived from the Skyline SampleName column so that it satisfies the conditions stated above (for skyline export = “no”). When the input data set is treated as a Skyline export, the columns can be in any order (and interspersed with other columns), as long as they are present in the data set with those exact names. When using N15 or other variants of reference standards, the appropriate light or heavy area column may need to be renamed. 
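To make the relationship between the two input formats concrete, the sketch below maps one row of a Skyline export onto the six generic columns. It is illustrative only and does not reproduce the module's own conversion. The function name and the "." separator are hypothetical. Treating light Area as the analyte and heavy Area as the SIS is an assumption that may not hold for N15 or other reference standards. The transition identifier follows the Output Files description (FragmentIon combined with PrecursorCharge).

```python
SKYLINE_COLUMNS = ["Sample", "PeptideSequence", "ReplicateName", "FragmentIon",
                   "PrecursorCharge", "ProductCharge", "light Area", "heavy Area"]


def skyline_row_to_generic(row, separator="."):
    """Map one Skyline export row (a dict keyed by column name) to generic columns."""
    missing = [name for name in SKYLINE_COLUMNS if name not in row]
    if missing:
        raise KeyError(f"missing Skyline columns: {missing}")
    return {
        "Sample": row["Sample"],
        "Replicate": row["ReplicateName"],
        "Peptide": row["PeptideSequence"],
        # Separator is hypothetical; the documentation only says "concatenation".
        "Transition.ID": f'{row["FragmentIon"]}{separator}{row["PrecursorCharge"]}',
        # Assumption: light = analyte, heavy = stable isotope-labeled standard.
        "Area": row["light Area"],
        "IS.Area": row["heavy Area"],
    }
```

Extra columns in the export are ignored, which matches the statement that the required columns may be interspersed with others.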
Notes: 

Output Files

The output file generated on successful completion of AuDIT has the following columns: 
  1. peptide: The Peptide column from the input data. 
  2. sample: The Sample column from the input data. 
  3. transition.id: The Transition.ID column from input data. For skyline exports, this will be a concatenation of the FragmentIon and PrecursorCharge columns. 
  4. pvalue.final: The multiple testing corrected t-test p-value for the transition under consideration. 
  5. status: The result of applying the p-value threshold to pvalue.final; ‘good’ if pvalue.final > p-value threshold, ‘bad’ otherwise. 
  6. cv: The calculated coefficient of variation for the replicates of this peptide/transition. 
  7. cv.status: Whether the CV is less than the CV threshold; ‘good’ if CV is less than threshold, ‘bad’ otherwise. 
  8. final.call: The final decision on whether the transition under consideration is imprecise or has interferences. The final.call is ‘good’ if both status and cv.status are ‘good’. If either status or cv.status is ‘bad’, final.call is ‘bad’. 
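The rules for status, cv.status, and final.call in the list above can be written out as a short Python sketch. The function name and the threshold values in the example are hypothetical. The sketch does not cover missing values, whose handling is not described here.

```python
def audit_calls(pvalue_final, cv, pvalue_threshold, cv_threshold):
    """Apply the two thresholds and combine them into a final call."""
    if not 0 <= pvalue_threshold <= 1:
        raise ValueError("pvalue threshold must be between 0 and 1 inclusive")
    # 'good' only when the corrected p-value is strictly above the threshold.
    status = "good" if pvalue_final > pvalue_threshold else "bad"
    # 'good' only when the CV is strictly below the threshold.
    cv_status = "good" if cv < cv_threshold else "bad"
    # Both checks must pass for the transition to be called 'good'.
    final_call = "good" if status == "good" and cv_status == "good" else "bad"
    return {"status": status, "cv.status": cv_status, "final.call": final_call}


# Hypothetical values: the p-value passes, but the CV is too high.
print(audit_calls(pvalue_final=0.4, cv=0.35, pvalue_threshold=0.01, cv_threshold=0.2))
```

In this example the transition is called ‘bad’ because one of the two checks fails, even though the other passes.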

Example Data

Example input data pre-processed with MultiQuant is available in site56-7.1-multiquant-data.csv. Data pre-processed with Skyline is in site56-7.1-skylinedata.csv. Results from running AuDIT on these datasets are available in site56-7.1-multiquant-data-results.csv and site56-7.1-skyline-data-result.csv. The Skyline example data has already been converted to the generic format, so AuDIT should be run on it with skyline export set to “no”.

Platform Dependencies

Module Type: Proteomics
CPU Type: any
OS: any
Language: R 2.5

GenePattern Module Version Notes

Version  Release Date  Description
6        2013-08-16    Fixed bug which caused "arguments imply differing number of rows" error
5        2011-09-21    Adjusted default values in manifest
4        2011-12-02    Improved error for when no peptides with 3 or more transitions were found and added note to doc
3        2011-09-23    Handles large dataset and data with different number of transitions for each peptide.
2        2010-04-20    Improvements for handling data with missing values and data input validation
1        2009-12-19