Mascot Server (since version 2.2) supports five labelled and label-free quantitation protocols. The protocols are Reporter, Precursor, Multiplex, Replicate and Average, and they are described in full on the Matrix Science website. Reporter and Multiplex work from reporter ions or fragment peaks in the MS/MS spectrum, which means peptide and protein ratios can be calculated directly from the results file. Precursor, Replicate and Average require additional information from the raw data file.
In Mascot 2.2, 2.3 and 2.4, quantitation ratios for Reporter and Multiplex were calculated by the Mascot report scripts. In Mascot 2.5, and Parser 2.5, the calculation has been integrated as part of the Parser results file interface. The interface has some support for Precursor, Replicate and Average quantitation data as well, although you will need to extract component intensities from the raw data first, for example using Mascot Distiller.
If the results file has been downloaded from the Mascot Server using the cgi/client.pl script with the result_file_mime2 parameter, then note that the results file will not contain the query sections that hold the MS/MS data, so it won't be possible to get quantitation ratios for Reporter and Multiplex data.
Mascot also supports the Exponentially Modified Protein Abundance Index (emPAI), which is an approximate, relative quantitation method based on protein sequence coverage. In Mascot 2.2, 2.3 and 2.4, emPAI calculation was part of the Mascot report scripts. In Mascot 2.5 and Parser 2.5, the calculation is implemented in ms_mascotresults::getProteinEmPAI(). Since emPAI uses only protein sequence coverage, not MS/MS or raw data, it is essentially "always on" when the search results contain enough input queries, and no special initialisation needs to be done. The rest of the document describes the five other quantitation protocols. Please refer to the documentation for ms_mascotresults::getProteinEmPAI() for more details on emPAI.
If you have a Mascot results file with Reporter or Multiplex data, and you used the correct quantitation method when submitting the search, then it is straightforward to extract peptide and protein ratios from the file using ms_ms2quantitation. Open the results file using ms_mascotresfile and ms_peptidesummary as described in Getting protein information; let the objects be resfile and pepsum, respectively. The following example code extracts component intensities and calculates peptide ratios in a single step, using the default quantitation method parameters:
quant = msparser.ms_ms2quantitation(pepsum)
if not quant.isValid():
    # report quant.getLastErrorString()
    return

# Data is now available in quant
(Small examples such as the one above are given in Python. A more complete example is given in other programming languages at the end of the section.)
ms_ms2quantitation supports three kinds of normalisation: the normalisation constants can be calculated from the set of all peptide ratios, a set of peptide sequences (peptide ratios associated with the sequences) or a set of protein accessions (peptide ratios associated with the accessions). The normalisation constant for each report ratio, say 115/114 in an iTRAQ file, is the median or geometric mean of the selected 115/114 peptide ratios.
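The arithmetic behind a normalisation constant can be illustrated outside Parser. The sketch below is plain Python and not the Parser implementation: the function name normalisation_base and the sample 115/114 values are made up for illustration, and dividing each ratio by the constant is an assumption about how the constant is applied.

```python
import math
import statistics


def normalisation_base(ratios, method="median"):
    """Return the median or geometric mean of a list of positive peptide ratios."""
    if method == "median":
        return statistics.median(ratios)
    if method == "geometric mean":
        return math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    raise ValueError("unknown method: %s" % method)


# Hypothetical 115/114 peptide ratios selected for normalisation.
ratios_115_114 = [0.5, 1.0, 2.0, 4.0, 8.0]

base_median = normalisation_base(ratios_115_114, "median")
base_geomean = normalisation_base(ratios_115_114, "geometric mean")

# Assumed application of the constant: divide every ratio by it.
normalised = [r / base_median for r in ratios_115_114]
```

With this symmetric sample the two methods agree; on skewed data the median is less sensitive to a few extreme ratios.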
Normalisation does not happen automatically, because it may require calculating peptide ratios for all peptide matches in the results file. If normalisation is configured, you can cause the normalisation constants to be computed using normalisePeptideRatios(). The individual normalisation constants can be read out with getPeptideRatioNormalisationBase(). Note that changing quantitation method parameters (especially normalisation settings) or editing report ratios will reset the normalisation constants, for consistency.
If normalisation is not configured, then normalisePeptideRatios() simply returns without doing anything. This means that you normally want to call normalisePeptideRatios() after creating an ms_ms2quantitation object, just to ensure the values match the quantitation method configuration. This is what we do in the following code examples.
If you already know the normalisation constants, or if you want to use your own values, you can use setPeptideRatioNormalisationBase() and skip the potentially time-consuming computation.
ms_ms2quantitation has limited support for normalisation at component intensity level. If your data is Reporter, you can enable the "sum" normalisation in the quantitation method settings and call normaliseIntensities() to cause the intensity-level normalisation constants to be computed. The method is exactly analogous to normalisePeptideRatios(), so the same caveats apply.
ms_ms2quantitation supports caching, but does not create a cache file itself. If the results file is Reporter or Multiplex, and caching is enabled, then ms_mascotresults will automatically extract and cache component intensities in the results cache file. ms_ms2quantitation will then read component intensities directly from it.
If the results cache file was created by an older version of Parser without full quantitation support, or if caching isn't in use, then component intensities are extracted from spectra as they are needed. In this case, the larger the file, the longer the ms_ms2quantitation constructor will take to run; the class needs to load component intensities for all quantifiable peptide matches to enable ratio normalisation.
Component intensity values can be accessed directly using ms_mascotresults::getComponentIntensity() and ms_ms2quantitation::getComponentIntensities().
Once you have an ms_ms2quantitation object, you can access peptide and protein ratios using getPeptideRatio() and getProteinRatio(), respectively. The following example prints the ratios of hit 1:
qm = quant.getQuantitationMethod()

# Collect the names of all report ratios defined in the method.
ratioNames = []
for i in range(qm.getNumberOfReportRatios()):
    r = qm.getReportRatioByNumber(i)
    ratioNames.append(r.getName())

hit = pepsum.getHit(1)
print("%s :" % hit.getAccession())

for ratioName in ratioNames:
    r = quant.getProteinRatio(hit.getAccession(), hit.getDB(), ratioName)

    if r.isMissing():
        print(" %s = (undefined)" % r.getRatioName())
    elif r.isInfinite():
        print(" %s = infinite" % r.getRatioName())
    else:
        print(" %s = %s (n = %d stderr = %s)" % (r.getRatioName(), r.getValue(), r.getSampleSize(), r.getStandardError()))
You can access the component intensities of each peptide ratio with getComponentIntensities().
A number of parameters change how peptide and protein ratios are calculated. These are encapsulated in the quantitation method object, ms_quant_method.
In the default workflow above, quantitation method parameters are read directly from the results file using the default XML schema. Certain parameters can be changed after the ms_ms2quantitation object has been created. These are normalisation, outlier detection, protein ratio calculation and including or excluding particular peptide ratios from protein ratio calculation. For example, you can change outlier detection in the following way:
qo = msparser.ms_quant_outliers()
qo.setMethod("dixons")
quant.setQuantOutliers(qo)
Changes take effect immediately; for example, values from getProteinRatio() may change after changing outlier detection, as some peptides may be excluded from protein ratio calculation.
It is not possible to change peptide quality settings in this way. (Quality settings determine which input queries are read from the results file, and changing the settings after the fact would require re-reading the results file.) You can use the default quantitation method as base configuration, alter the peptide-level settings, and then give the new method as argument to the constructor as follows:
embeddedQm = resfile.getQuantitationMethod()
if not embeddedQm:
    # not a quant results file
    return

# Note that we must create a copy, as embeddedQm is supposed to be const:
qm = msparser.ms_quant_method(embeddedQm)
qm.getQuality().setMinPrecursorCharge(2)

quant = msparser.ms_ms2quantitation(pepsum, qm)
if not quant.isValid():
    # report quant.getLastErrorString()
    return

quant.normalisePeptideRatios()
You can alter nearly all quantitation method settings this way. There are only two exceptions:
Please refer to the quantitation configuration help for more details on quantitation method parameters and how they affect ratio calculation.
The example below shows how you can create an ms_ms2quantitation object, then change the default parameters and print the report ratios of the first protein hit. The example is given in each of the programming languages supported by Parser (C++, Perl, Java, Python and C#):
// C++
const ms_quant_method *embeddedQm = resfile.getQuantitationMethod();
if (!embeddedQm) {
    // not a quant results file
    return;
}

ms_quant_method qm = *embeddedQm;

// Change a peptide quality setting:
qm.getQuality()->setMinPrecursorCharge(2);

ms_ms2quantitation quant(pepsum, qm);
if (!quant.isValid()) {
    // report quant.getLastErrorString()
    return;
}

// Change a peptide ratio outlier setting (could be done before quant is
// created). getOutliers() returns a pointer, so dereference it to copy:
ms_quant_outliers qo = qm.getOutliers() ? *qm.getOutliers() : ms_quant_outliers();
qo.setMethod("dixons");
quant.setQuantOutliers(qo);

// Then normalise, if normalisation is enabled.
quant.normalisePeptideRatios();

const ms_quant_method *currentQm = quant.getQuantitationMethod();

std::vector<std::string> ratioNames;
for (int i = 0; i != currentQm->getNumberOfReportRatios(); i++) {
    const ms_quant_ratio *r = currentQm->getReportRatioByNumber(i);
    ratioNames.push_back(r->getName());
}

const ms_protein *hit = pepsum.getHit(1);
std::cout << hit->getAccession() << ":" << std::endl;

for (std::vector<std::string>::size_type i = 0; i != ratioNames.size(); i++) {
    ms_protein_quant_ratio r = quant.getProteinRatio(hit->getAccession(), hit->getDB(), ratioNames[i]);

    if (r.isMissing()) {
        std::cout << " " << r.getRatioName() << " = (undefined)" << std::endl;
    } else if (r.isInfinite()) {
        std::cout << " " << r.getRatioName() << " = infinite" << std::endl;
    } else {
        std::cout << " " << r.getRatioName() << " = " << r.getValue()
                  << " (n = " << r.getSampleSize()
                  << " stderr = " << r.getStandardError() << ")" << std::endl;
    }
}
# Perl
my $embeddedQm = $resfile->getQuantitationMethod();
if (not $embeddedQm) {
    # not a quant results file
    return;
}

my $qm = new msparser::ms_quant_method($embeddedQm);

# Change a peptide quality setting:
$qm->getQuality()->setMinPrecursorCharge(2);

my $quant = new msparser::ms_ms2quantitation($pepsum, $qm);
if (not $quant->isValid()) {
    # report $quant->getLastErrorString()
    return;
}

# Change a peptide ratio outlier setting (could be done before quant is
# created):
my $qo = $qm->getOutliers() ? $qm->getOutliers() : new msparser::ms_quant_outliers();
$qo->setMethod("dixons");
$quant->setQuantOutliers($qo);

# Then normalise, if normalisation is enabled.
$quant->normalisePeptideRatios();

my $currentQm = $quant->getQuantitationMethod();

my @ratioNames = ();
for my $i (0 .. $currentQm->getNumberOfReportRatios()-1) {
    my $r = $currentQm->getReportRatioByNumber($i);
    push @ratioNames, $r->getName();
}

my $hit = $pepsum->getHit(1);
print $hit->getAccession(), ":\n";

for my $ratioName (@ratioNames) {
    my $r = $quant->getProteinRatio($hit->getAccession(), $hit->getDB(), $ratioName);

    if ($r->isMissing()) {
        print " ", $r->getRatioName(), " = (undefined)\n";
    } elsif ($r->isInfinite()) {
        print " ", $r->getRatioName(), " = infinite\n";
    } else {
        print " ", $r->getRatioName(), " = ", $r->getValue(),
              " (n = ", $r->getSampleSize(),
              " stderr = ", $r->getStandardError(), ")", "\n";
    }
}
// Java
ms_quant_method embeddedQm = resfile.getQuantitationMethod();
if (embeddedQm == null) {
    // not a quant results file
    return;
}

// Change a peptide quality setting:
ms_quant_method qm = new ms_quant_method(embeddedQm);
qm.getQuality().setMinPrecursorCharge(2);

ms_ms2quantitation quant = new ms_ms2quantitation(pepsum, qm);
if (!quant.isValid()) {
    // report quant.getLastErrorString()
    return;
}

// Change a peptide ratio outlier setting (could be done before quant is
// created):
ms_quant_outliers qo = qm.haveOutliers() ? qm.getOutliers() : new ms_quant_outliers();
qo.setMethod("dixons");
quant.setQuantOutliers(qo);

// Then normalise, if normalisation is enabled.
quant.normalisePeptideRatios();

ms_quant_method currentQm = quant.getQuantitationMethod();

Vector<String> ratioNames = new Vector<String>();
for (int i = 0; i != currentQm.getNumberOfReportRatios(); i++) {
    ms_quant_ratio r = currentQm.getReportRatioByNumber(i);
    ratioNames.add(r.getName());
}

ms_protein hit = pepsum.getHit(1);
System.out.println(hit.getAccession() + ":");

for (String ratioName : ratioNames) {
    ms_protein_quant_ratio r = quant.getProteinRatio(hit.getAccession(), hit.getDB(), ratioName);

    if (r.isMissing()) {
        System.out.println(" " + r.getRatioName() + " = (undefined)");
    } else if (r.isInfinite()) {
        System.out.println(" " + r.getRatioName() + " = infinite");
    } else {
        System.out.println(" " + r.getRatioName() + " = " + r.getValue()
            + " (n = " + r.getSampleSize()
            + " stderr = " + r.getStandardError() + ")");
    }
}
# Python
embeddedQm = resfile.getQuantitationMethod()
if not embeddedQm:
    # not a quant results file
    return

qm = msparser.ms_quant_method(embeddedQm)

# Change a peptide quality setting:
qm.getQuality().setMinPrecursorCharge(2)

quant = msparser.ms_ms2quantitation(pepsum, qm)
if not quant.isValid():
    # report quant.getLastErrorString()
    return

# Change a peptide ratio outlier setting (could be done before quant is
# created):
qo = qm.getOutliers() if qm.haveOutliers() else msparser.ms_quant_outliers()
qo.setMethod("dixons")
quant.setQuantOutliers(qo)

# Then normalise, if normalisation is enabled.
quant.normalisePeptideRatios()

currentQm = quant.getQuantitationMethod()

ratioNames = []
for i in range(currentQm.getNumberOfReportRatios()):
    r = currentQm.getReportRatioByNumber(i)
    ratioNames.append(r.getName())

hit = pepsum.getHit(1)
print("%s:" % hit.getAccession())

for ratioName in ratioNames:
    r = quant.getProteinRatio(hit.getAccession(), hit.getDB(), ratioName)

    if r.isMissing():
        print(" %s = (undefined)" % r.getRatioName())
    elif r.isInfinite():
        print(" %s = infinite" % r.getRatioName())
    else:
        print(" %s = %s (n = %d stderr = %s)" % (r.getRatioName(), r.getValue(), r.getSampleSize(), r.getStandardError()))
// C#
ms_quant_method embeddedQm = new ms_quant_method();
bool bQuant = resfile.getQuantitationMethod(embeddedQm);
if (!bQuant) {
    // not a quant results file
    return;
}

// Change a peptide quality setting:
ms_quant_method qm = new ms_quant_method(embeddedQm);
qm.getQuality().setMinPrecursorCharge(2);

ms_ms2quantitation quant = new ms_ms2quantitation(pepsum, qm);
if (!quant.isValid()) {
    // report quant.getLastErrorString()
    return;
}

// Change a peptide ratio outlier setting (could be done before quant is
// created):
ms_quant_outliers qo = qm.haveOutliers() ? qm.getOutliers() : new ms_quant_outliers();
qo.setMethod("dixons");
quant.setQuantOutliers(qo);

// Then normalise, if normalisation is enabled.
quant.normalisePeptideRatios();

ms_quant_method currentQm = quant.getQuantitationMethod();

List<string> ratioNames = new List<string>();
for (int i = 0; i != currentQm.getNumberOfReportRatios(); i++) {
    ms_quant_ratio r = currentQm.getReportRatioByNumber(i);
    ratioNames.Add(r.getName());
}

ms_protein hit = pepsum.getHit(1);
Console.WriteLine("{0}:", hit.getAccession());

foreach (string ratioName in ratioNames) {
    ms_protein_quant_ratio r = quant.getProteinRatio(hit.getAccession(), hit.getDB(), ratioName);

    if (r.isMissing()) {
        Console.WriteLine(" {0} = (undefined)", r.getRatioName());
    } else if (r.isInfinite()) {
        Console.WriteLine(" {0} = infinite", r.getRatioName());
    } else {
        Console.WriteLine(" {0} = {1} (n = {2} stderr = {3})",
            r.getRatioName(), r.getValue(), r.getSampleSize(), r.getStandardError());
    }
}
It is possible to use ms_ms2quantitation with combined results files, as long as all files have identical settings and search parameters. (They cannot be combined otherwise; see Combining multiple results files.) After creating the ms_ms2quantitation object, you can change quantitation method settings and access peptide and protein ratios as above.
There is an additional problem, though: peptide ratio definitions may differ between files. For example, suppose you have combined two iTRAQ 4-plex results files. The first file contains ratios "115/114", "116/114" and "117/114", while the second one contains "114/115", "116/115" and "117/115" – that is, a different channel is used as the reference component.
What should happen now? The ms_ms2quantitation class merges together ratio definitions with the same name. Imagine peptide ratios arranged in a table such that a row contains the report ratios for a particular peptide (q,p), and the columns are the ratios themselves ("115/114", "116/114", "117/114"). Merging in this context simply means that peptide ratios with the same name appear in the same column, regardless of which file they come from. Since the ratio names differ between files, the combined table will have six columns. Peptides from the first file will have a null value for ratios defined in the second file, and vice versa.
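The merged table can be pictured with a small plain-Python sketch. It does not use Parser; the (q,p) keys, the ratio values and the helper name merge_ratio_tables are invented for illustration only.

```python
# Hypothetical peptide ratios from two combined iTRAQ files, keyed by (q, p).
file1 = {(10, 1): {"115/114": 1.2, "116/114": 0.9, "117/114": 1.1}}
file2 = {(510, 1): {"114/115": 0.8, "116/115": 1.0, "117/115": 1.3}}


def merge_ratio_tables(*tables):
    """Merge per-file ratio tables; ratios with the same name share a column."""
    columns = []
    for table in tables:
        for ratios in table.values():
            for name in ratios:
                if name not in columns:
                    columns.append(name)

    merged = {}
    for table in tables:
        for key, ratios in table.items():
            # A ratio that is not defined for this peptide becomes None (null).
            merged[key] = [ratios.get(name) for name in columns]
    return columns, merged


columns, merged = merge_ratio_tables(file1, file2)
```

Because no ratio name is shared between the two files, the merged table has six columns and every row is half empty.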
A similar problem exists even when ratio names are identical. For example, "115/114" in the first file may be the control, while the control in the second file is actually "116/114".
The solution to the first problem is to redefine the report ratio formulas in the second file, while the second problem is solved by renaming the ratios (e.g. "Control" for "115/114" in the first file and "Control" for "116/114" in the second file). Both can be accomplished using ms_ms2quantitation::setCombinedReportRatio(). Report ratio definition changes take effect immediately; no recalculation needs to be done. After changing the ratios, getPeptideRatio() and getProteinRatio() will return values according to the new definitions. You can enumerate the combined list of ratio names using ms_ms2quantitation::getCombinedReportRatioNames().
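To make the two fixes concrete, here is a conceptual sketch in plain Python. It deliberately does not call setCombinedReportRatio(), whose arguments are not shown here; the component intensities and the helper ratio() are hypothetical, and a report ratio is assumed to be a simple numerator/denominator of two component intensities.

```python
# Hypothetical component intensities for one peptide match from each file.
intensities_file1 = {"114": 1000.0, "115": 1200.0, "116": 900.0, "117": 1100.0}
intensities_file2 = {"114": 800.0, "115": 1000.0, "116": 1000.0, "117": 1300.0}


def ratio(intensities, numerator, denominator):
    """Compute a simple report ratio from two component intensities."""
    return intensities[numerator] / intensities[denominator]


# Problem 1: file 2 uses 115 as the reference channel. Redefining its ratio
# as 115/114 puts its peptides in the same column as those of file 1.
redefined = {
    "file1": ratio(intensities_file1, "115", "114"),
    "file2": ratio(intensities_file2, "115", "114"),
}

# Problem 2: the control is 115/114 in file 1 but 116/114 in file 2.
# Giving both definitions the common name "Control" aligns them.
control_definition = {"file1": ("115", "114"), "file2": ("116", "114")}
control = {
    name: ratio(intensities, *control_definition[name])
    for name, intensities in (("file1", intensities_file1), ("file2", intensities_file2))
}
```

In both cases the underlying intensities are unchanged; only the definition or the name of the ratio differs per file.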
If your results file uses the Precursor, Replicate or Average protocols, or if you have a custom protocol that doesn't fit neatly in any category, you can't use ms_ms2quantitation, because the class expects to extract Reporter or Multiplex component intensities from the MS/MS data. Precursor, Replicate and Average require data from the original raw file, and there is no mechanism to feed in the raw file in Parser.
If you need this level of functionality, you need to use a tool such as Mascot Distiller. Mascot Distiller quantitation results from these protocols can be extracted using Mascot Parser; see Accessing Distiller Quantitation Results.
If you have calculated peptide ratios in some way (e.g. by extracting component intensities from XICs), then you can use ms_customquantitation, which is the sibling class to ms_ms2quantitation. The class shares much of the same interface. In particular, both classes implement the same protein ratio calculation and significance testing, while ms_customquantitation allows greater control over peptide ratios.
In Reporter and Multiplex protocols, each peptide ratio is associated with a single peptide match (MS/MS spectrum). This is not necessarily the case with other protocols, where the peptide ratio can be calculated from a combination of features. In custom quantitation, each peptide ratio is associated with a peptide quant key, which fully identifies the ratio. The peptide quant key can be, for example, the query and rank of an identified peptide (in Reporter and Multiplex) or it can be any unique string value.
You will need to define a mapping from protein accessions to peptide quant keys for protein ratio calculation purposes; that is, you need to define which peptide ratios should be used in calculating a protein ratio. When you call getProteinRatio(), the class will look up all unique peptide ratios associated with the peptide quant keys associated with the protein accession, and use that set of ratios to calculate the protein ratio (the algorithm is the same as in ms_ms2quantitation). For this to work, you also need to configure protein ratio calculation settings in the quantitation method, or define outlier, normalisation and other settings after creating the ms_customquantitation object.
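The lookup described above can be sketched in plain Python. This is only an illustration of the idea, not Parser code: the accession "P12345", the ratio name "H/L", the keys and the values are invented, outlier removal and minimum peptide counts are ignored, and the protein ratio is taken as the geometric mean of the unique peptide ratios.

```python
import math

# Peptide ratios keyed by peptide quant key; a key may be a (query, rank)
# pair or any unique string.
peptide_ratios = {
    (101, 1): {"H/L": 2.0},
    (102, 1): {"H/L": 8.0},
    "feature-7": {"H/L": 4.0},
}

# Mapping from (accession, database index) to peptide quant keys.
# The duplicate key is deliberate: each ratio must be counted only once.
protein_to_keys = {
    ("P12345", 1): [(101, 1), (102, 1), "feature-7", (101, 1)],
}


def protein_ratio(accession, db_idx, ratio_name):
    """Geometric mean of the unique peptide ratios mapped to a protein."""
    unique_keys = []
    for key in protein_to_keys.get((accession, db_idx), []):
        if key not in unique_keys and ratio_name in peptide_ratios.get(key, {}):
            unique_keys.append(key)
    if not unique_keys:
        return None  # no peptide ratios: the protein ratio is missing
    logs = [math.log(peptide_ratios[k][ratio_name]) for k in unique_keys]
    return math.exp(sum(logs) / len(logs))


value = protein_ratio("P12345", 1, "H/L")
missing = protein_ratio("Q99999", 1, "H/L")
```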
ms_customquantitation can be used in a number of ways, and the right way depends on problem context. An example is sketched out below, using Python-like pseudocode syntax:
qm = resfile.getQuantitationMethod(...)
if not qm:
    return

qm.getNormalisation().setMethod("none")
qm.getOutliers().setMethod("none")
qm.setMinNumPeptides(2)
qm.setProteinRatioType("average")

quant = msparser.ms_customquantitation(qm)
if not quant.isValid():
    # report quant.getLastErrorString()
    return

# Set normalisation constants here if needed:
#quant.setNormalisationBase(..., ...)

for (loop over some data):
    q = ...
    p = ...
    ratioName = ...
    value = ...

    key = msparser.ms_peptide_quant_key(q, p)
    r = msparser.ms_peptide_quant_ratio(key, ratioName, value)

    # Add peptide ratio:
    quant.addPeptideRatio(r)

    accession = ...
    dbIdx = ...

    # Push it to the protein-peptide mapping:
    quant.addPeptideQuantKey(accession, dbIdx, key)
If you have a Mascot results file and open it as ms_peptidesummary, you can give the object as a parameter to ms_customquantitation, which will initialise the peptide-protein mappings directly from the results file. This saves quite a bit of work, and you are still able to edit the mappings later. For example:
# This also pulls the quantitation method from the results file.
quant = msparser.ms_customquantitation(pepsum)
if not quant.isValid():
    # report quant.getLastErrorString()
    return

# Alter default settings:
qo = msparser.ms_quant_outliers()
qo.setMethod("none")
quant.setQuantOutliers(qo)
quant.setMinNumPeptides(2)
quant.setProteinRatioType("average")

# Set normalisation constants here if needed:
#quant.setNormalisationBase(..., ...)

for (loop over some data):
    q = ...
    p = ...
    ratioName = ...
    value = ...

    key = msparser.ms_peptide_quant_key(q, p)
    r = msparser.ms_peptide_quant_ratio(key, ratioName, value)

    # Add peptide ratio:
    quant.addPeptideRatio(r)

    # No need to use addPeptideQuantKey() -- mapping is already there.
If you are using ms_customquantitation, you may need to determine whether a peptide match is worth quantitating (for example, it may not be a statistically significant match). The ms_quant_helper class provides two methods to this end. The process is as follows, assuming qm is a quantitation method object that defines both peptide quality settings and the components:
umodfile = msparser.ms_umod_configfile()
if not resfile.getUnimod(umodfile, True):
    # log reason
    return

qhelp = msparser.ms_quant_helper(pepsum, qm, umodfile)
if not qhelp.isValid():
    # log reason
    return

for (loop over queries q and ranks p to quantitate):
    (qres, reason) = qhelp.isPeptideQuantifiable(q, p)
    if qres != msparser.ms_quant_helper.PEPTIDE_IS_QUANTIFIABLE:
        # log reason and skip peptide
        continue

    (qualres, reason) = qhelp.isPeptideQualityOK(q, p)
    if qualres != msparser.ms_quant_helper.PEPTIDE_QUALITY_IS_OK:
        # log reason and skip peptide
        continue

    # Add additional tests here if needed; otherwise the peptide contains the
    # required modifications (if specified in this protocol) needed for
    # quantitation and passes the score thresholds (if defined in quality
    # settings), so it can be quantified.
Additional tests can be, for example, testing the peptide sequence for uniqueness with ms_mascotresults::isPeptideUnique().
If you have an ms_ms2quantitation object, you can also pass it as an argument to the ms_customquantitation constructor. The constructor will pull all quantitation method settings, peptide-protein mappings and peptide ratios from the object and store them internally. You can then edit the peptide ratios or peptide-protein mappings if needed. The constructor documentation gives more details.
Note that ms_customquantitation does not support caching of any kind, as it is very difficult to define what goes in the cache file and what does not. If you need to cache data, you can use ms_tinycdb to read and write cache files.
The Average protocol differs from the other four. First of all, the concept of peptide ratio has no meaning in the protocol. Protein ratios are instead calculated directly from the top three (or top N) peptide intensities assigned to the protein hit.
Second, only one protein ratio type is supported: average. The average protein ratio type has special meaning when the protocol is Average, in that the protein ratio is simply the sum of its top N peptide intensities, divided by the sum of the top N peptide intensities of a named reference protein. In other protocols, average means the geometric mean of peptide ratios.
Third, outlier testing has no meaning in the Average protocol. It is not possible to use any other setting for outlier removal than none.
We suggest the following workflow for using Average with ms_customquantitation:
1. Create an ms_customquantitation object with a quantitation method object configured for the Average protocol.
2. Add the peptide intensities as ms_peptide_quant_ratio objects: let the single ratio name be "intensity" and its value simply the raw intensity.
3. Read the protein ratios with getProteinRatio(), where the ratio name is "intensity". The values will be relative to the intensity of the reference protein, taking the configured reference amount into account.
Note that peptide and protein ratio normalisation is meaningless in the Average protocol.
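The Average calculation described above (sum of a protein's top N peptide intensities divided by the same sum for the reference protein) can be sketched in plain Python. This is an illustration rather than Parser's code: the intensities and function names are hypothetical, and scaling by the configured reference amount is left out.

```python
def top_n_sum(intensities, n=3):
    """Sum of the n largest peptide intensities."""
    return sum(sorted(intensities, reverse=True)[:n])


def average_protocol_ratio(protein_intensities, reference_intensities, n=3):
    """Top-N intensity sum of a protein relative to the reference protein."""
    return top_n_sum(protein_intensities, n) / top_n_sum(reference_intensities, n)


# Hypothetical raw peptide intensities assigned to two protein hits.
protein = [500.0, 300.0, 200.0, 50.0]
reference = [1000.0, 600.0, 400.0, 100.0]

ratio_value = average_protocol_ratio(protein, reference)
```

Only the three most intense peptides of each protein contribute here; the fourth intensity in each list is ignored.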
Statistical routines used by other quantitation classes are collected in ms_quant_stats. The class is used internally by ms_ms2quantitation and ms_customquantitation to determine peptide ratio outliers and calculate the protein ratio p-value and statistical significance. You can use the class to calculate protein ratios manually in the following way, assuming quant contains an ms_quantitation object or one of its child classes (ms_ms2quantitation, ms_customquantitation):
import math

keys = quant.getPeptideQuantKeys(accession, dbIdx)
ratioName = ...  # the ratio we're interested in

# Collect the finite peptide ratios for this protein.
rawRatios = []
for i in range(keys.size()):
    if not quant.hasPeptideRatio(keys.get(i), ratioName):
        continue

    r = quant.getPeptideRatio(keys.get(i), ratioName)
    if not r.isInfinite():
        rawRatios.append(r.getValue())

outlierIndices = msparser.ms_quant_stats.detectOutliers(rawRatios, "auto")

# Keep only the ratios that were not flagged as outliers.
ratios = []
for i in range(len(rawRatios)):
    is_outlier = False
    for j in range(len(outlierIndices)):
        if i == outlierIndices[j]:
            is_outlier = True
            break

    if is_outlier:
        continue

    ratios.append(rawRatios[i])

median = msparser.ms_quant_stats.unsortedMedian(ratios)
logratios = msparser.ms_quant_stats.logTransform(ratios)
logmean = msparser.ms_quant_stats.arithmeticMean(logratios)
logstdev = msparser.ms_quant_stats.arithmeticStandardDeviation(logratios, logmean)
logstderr = msparser.ms_quant_stats.arithmeticStandardErrorOfMedian(logstdev, len(ratios))
stderr = math.exp(logstderr)
ms_quant_stats also contains functions for calculating the cumulative distribution function and critical values of a few commonly used distributions. Please consult the class documentation for more details.
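For readers who want to see the log-space arithmetic on its own, the following sketch repeats the general idea with the Python standard library instead of ms_quant_stats. It is an approximation under stated assumptions: the ratios are invented, outliers are assumed to be removed already, and the standard error of the mean is used, which may differ from the estimator Parser applies.

```python
import math
import statistics

# Hypothetical peptide ratios for one protein, outliers already removed.
ratios = [0.5, 1.0, 2.0, 4.0]

median = statistics.median(ratios)

# Ratios are treated as log-normal, so statistics are computed in log space.
log_ratios = [math.log(r) for r in ratios]
log_mean = statistics.mean(log_ratios)
log_stdev = statistics.stdev(log_ratios, log_mean)   # sample standard deviation
log_stderr = log_stdev / math.sqrt(len(ratios))      # standard error of the mean

# Transform back: a geometric mean and a multiplicative (geometric) error.
geometric_mean = math.exp(log_mean)
geometric_stderr = math.exp(log_stderr)
```

A geometric standard error is a factor rather than an offset, so it is always at least 1 and is read as "times or divided by".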