Mascot: The trusted reference standard for protein identification by mass spectrometry for 25 years

Posted by John Cottrell (October 14, 2013)

Modifications round-up, part 2

This is the second of two articles dealing with topics relating to modifications. The first can be found here. Note that Site analysis was covered in an earlier article.

Why aren’t amino acid substitutions listed in the search form?

Amino acid substitutions are rare and there are lots of them, so the only practical way to use them is in an error tolerant search. Listing them in the search form would just create clutter. If you have a special case where a particular substitution is needed in the search form, simply change the classification from AA substitution to any of the other possibilities using the configuration editor.

Exclusive modifications

Exclusive modifications can be thought of as a choice of fixed modifications. In many quantitation experiments, separate samples are derivatised then pooled. Thus, a given peptide may carry one or the other set of modifications, but never a mixture of both. Some people use the term "binary" for this type of specificity. We prefer exclusive because binary implies only two possibilities. The value of exclusive modifications is that they keep the search space small, which avoids the combinatorial explosion that can occur with too many variable modifications.
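To get a feel for the size of the saving, here is a minimal counting sketch. It is a simplified model, not how Mascot enumerates candidates: it assumes a peptide with a given number of label sites, and compares treating the labels as independent variable modifications against treating them as an exclusive choice. The function names are hypothetical.

```python
def variable_candidates(n_sites: int, n_labels: int) -> int:
    """Labels treated as variable: each site independently carries
    one of the labels or nothing at all."""
    return (n_labels + 1) ** n_sites


def exclusive_candidates(n_sites: int, n_labels: int) -> int:
    """Labels treated as exclusive: every site carries the same label,
    so there is one candidate per label (or one if there are no sites)."""
    return n_labels if n_sites > 0 else 1


if __name__ == "__main__":
    # Two alternative labels, peptides with an increasing number of sites.
    for n_sites in (1, 2, 4, 6):
        print(n_sites,
              variable_candidates(n_sites, 2),
              exclusive_candidates(n_sites, 2))
```

In this model the variable count grows exponentially with the number of sites, while the exclusive count stays at the number of labels.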

Exclusive modifications can only be specified as part of a quantitation method; they cannot be selected in the search form. There is a related parameter called Constrain search. If this is set true, exclusive modifications are treated as a choice of fixed modifications during the search. This makes the search space small so that the search is fast and the significance threshold score is kept low. If Constrain search is false, exclusive modifications are treated as variable during the search, and it is only during quantitation that they are treated as a choice of fixed modifications. This can be useful if you suspect chemistry problems, and want to see matches to peptides that are only partially modified.

Sometimes, you may be forced to set Constrain search to false because Mascot doesn’t allow one or more of the modifications to be fixed. An example would be a protein N-terminus modification or a modification with multiple neutral losses. Only residue and peptide terminus mods with zero or one neutral losses can be fixed.

Multiple modifications at a single site

Outside of quantitation, there aren’t many occasions when you need to match peptides with multiple modifications at a single residue or terminus. In most cases, multiple modifications don’t occur because of chemistry, e.g. you can’t have Carbamidomethyl (C) and Propionamide (C) on the same cysteine. In other cases, incremental modifications are handled via separate modifications, e.g. Methyl (K), Dimethyl (K), and Trimethyl (K).

The picture for SILAC quantitation is different because SILAC is implemented using modifications. Until recently, if you wanted to search for modifications that target the labelled residues, you had to define combination modifications, and include these in the search as variable modifications. For example, in Unimod, you’ll find modifications such as Label:13C(4)+Oxidation and Label:13C(6)+Acetyl. This is unsatisfactory because it doubles the number of variable modifications required.

In Mascot 2.4, we added support for multiple modifications, but only in the context of a quantitation method. The restrictions are:

  • Two modifications at a single site are allowed, but only one of these can be a variable modification. The other must be an exclusive modification, defined in a quantitation method, such as a SILAC label.
  • Constrain search must be set true in the quantitation method.
  • The search must include the new parameter MULTI_SITE_MODS=1.

This was a major change, requiring a lot of new code in Parser, to handle the possibility of a second modification at the same site, and Distiller, to ensure peptides were grouped correctly. For example, in a 3 component SILAC experiment, using R+6 and R+10, if we had a match to this peptide:
K.QMEQISQFLQAAER.Y + Oxidation (M); Label:13C(6) (R)

It would partner with these, whether or not we also got matches for them:
K.QMEQISQFLQAAER.Y + Oxidation (M)
K.QMEQISQFLQAAER.Y + Oxidation (M); Label:13C(6)15N(4) (R)

It would not partner with:
K.QMEQISQFLQAAER.Y + Oxidation (M); Phospho (ST)
K.QMEQISQFLQAAER.Y + Label:13C(6)15N(4) (R)
etc.
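
The partnering rule in this example can be sketched as follows. This is an illustration only, not Distiller's code: it assumes matches are written as in the strings above, that label modifications are the ones whose names start with "Label:", and that two matches are partners when they share the peptide and the same set of non-label modifications. The function name partner_key is hypothetical.

```python
def partner_key(match: str):
    """Return (peptide, non-label modifications) for a match string such as
    'K.QMEQISQFLQAAER.Y + Oxidation (M); Label:13C(6) (R)'."""
    peptide, _, mods = match.partition(" + ")
    names = [m.strip() for m in mods.split(";") if m.strip()]
    # Labels distinguish the quantitation components, so they are ignored
    # when deciding which matches belong together.
    non_label = tuple(sorted(m for m in names if not m.startswith("Label:")))
    return peptide.strip(), non_label


reference = "K.QMEQISQFLQAAER.Y + Oxidation (M); Label:13C(6) (R)"
candidates = [
    "K.QMEQISQFLQAAER.Y + Oxidation (M)",
    "K.QMEQISQFLQAAER.Y + Oxidation (M); Label:13C(6)15N(4) (R)",
    "K.QMEQISQFLQAAER.Y + Oxidation (M); Phospho (ST)",
    "K.QMEQISQFLQAAER.Y + Label:13C(6)15N(4) (R)",
]

for candidate in candidates:
    same_group = partner_key(candidate) == partner_key(reference)
    print(same_group, candidate)
```

Under these assumptions the first two candidates group with the reference match and the last two do not.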

Plenty of scope for bugs! The other concern was that third-party code that did not expect multiple modifications could easily misrepresent the search results and display reports with inconsistencies. This was the reason for requiring MULTI_SITE_MODS=1, which ensures that third-party software won’t accidentally enable the new behaviour. At present, the only time you are likely to see this new feature is when you submit a search from Distiller.

Be a little bit careful when performing non-SILAC quantitation. For example, if you are using dimethyl labelling, and you specify other variable modifications for K or N-term, there is no chemical intelligence in Mascot to decide whether the variable modification and the dimethyl label could both apply to the same site. If the peptide mass fits to the combination, it will try both together. You shouldn’t get a significant match, of course, if the peptide doesn’t exist, but if you start digging in the low scoring junk, you might find some unlikely looking combinations.

Metabolic labelling

In metabolic labelling, as opposed to SILAC, the label is present throughout the peptide backbone. 15N is the most widely used label, with 13C used less often because of cost. Since the masses of all residues are shifted, it isn’t practical to handle this using modifications. Choose an appropriate quantitation method, e.g. 15N metabolic [MD], even if you don’t intend using Mascot Distiller to perform quantitation. The spectra will be searched twice, once with mass values corresponding to 14N and once with 15N.
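
To illustrate why every residue mass shifts, here is a small sketch that counts the nitrogen atoms in a peptide and converts that count into the 14N to 15N mass difference. It is not Mascot code. The nitrogen counts are the standard values for the 20 common residues, the isotope masses are rounded, and the peptide is simply borrowed from the SILAC example earlier in this article.

```python
# Nitrogen atoms per amino acid residue (backbone N included).
NITROGENS = {
    "G": 1, "A": 1, "S": 1, "P": 1, "V": 1, "T": 1, "C": 1, "L": 1,
    "I": 1, "N": 2, "D": 1, "Q": 2, "K": 2, "E": 1, "M": 1, "H": 3,
    "F": 1, "R": 4, "Y": 1, "W": 2,
}

# Mass difference between 15N and 14N in Da (rounded).
DELTA_15N = 15.0001089 - 14.0030740


def nitrogen_count(sequence: str) -> int:
    """Total number of nitrogen atoms in an unmodified peptide."""
    return sum(NITROGENS[residue] for residue in sequence)


def n15_shift(sequence: str) -> float:
    """Mass shift in Da between the fully 15N and the 14N peptide."""
    return nitrogen_count(sequence) * DELTA_15N


if __name__ == "__main__":
    peptide = "QMEQISQFLQAAER"
    print(nitrogen_count(peptide), round(n15_shift(peptide), 4))
```

Because the shift depends on the composition of each residue, it cannot be expressed as a handful of site-specific modifications.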

Note that it can be difficult to get matches to 15N peptides because even a small degree of under-enrichment causes extensive tailing to the isotope distribution resulting in peptide mass values that can be several Da too low. This isn’t a huge problem for a quantitation experiment, because there will usually be a match to the light peptide from which the mass of the heavy peptide can be inferred. But, if you are analysing labelled peptides only, rather than a mixture of heavy and light, you might find you need to use a very wide precursor mass tolerance.
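
The effect of under-enrichment can be estimated with a simple binomial model. The sketch below is a deliberate simplification: it considers only nitrogen, ignores 13C and the other natural isotopes, and assumes each nitrogen position is labelled independently with the same probability. The function names are hypothetical.

```python
from math import comb


def n14_distribution(n_nitrogens: int, enrichment: float) -> list:
    """Probability of finding k residual 14N atoms, for k = 0..n_nitrogens,
    when each nitrogen is 15N with probability `enrichment`."""
    p_light = 1.0 - enrichment
    return [
        comb(n_nitrogens, k) * p_light ** k * enrichment ** (n_nitrogens - k)
        for k in range(n_nitrogens + 1)
    ]


def most_likely_n14(n_nitrogens: int, enrichment: float) -> int:
    """Number of residual 14N atoms in the most abundant species."""
    dist = n14_distribution(n_nitrogens, enrichment)
    return max(range(len(dist)), key=dist.__getitem__)


if __name__ == "__main__":
    # A peptide with 21 nitrogen atoms at two enrichment levels.
    for enrichment in (0.98, 0.95):
        dist = n14_distribution(21, enrichment)
        print(enrichment, round(dist[0], 3), most_likely_n14(21, enrichment))
```

In this model, at 95% enrichment a peptide with 21 nitrogens is most often found with one residual 14N, so the most abundant species sits roughly 1 Da below the fully labelled mass, and larger peptides are affected more.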

It isn’t always clear how best to handle modifications in metabolically labelled data. If the modification is due to formal derivatisation or an artefact that occurs in sample handling, the modification itself will usually not be labelled. If it is a modification that occurs in the labelled media, during protein production, the modification may incorporate the label. The current behaviour in Mascot is that metabolic labelling applies to both residues and modifications. If you want an unlabelled modification, and the modification contains the labelling element, you need to use or create a modification with the same mass shift but without the element. This is not ideal, and may change at some time in the future so that you can control whether the label applies to the modification by defining the modification at search level (no label) or at quantitation component level (labelled).

Most metabolic labelling is 15N and few of the very common modifications contain nitrogen. The main exceptions are Carbamidomethyl (C), which will not normally be labelled, and Deamidation (NQ). The preferred fix is to alkylate with iodoacetic acid rather than iodoacetamide. If this is not possible, create a modification with composition C(4)H(2)Li and use this in place of Carbamidomethyl (C), since the mass difference is only 0.01 Da.
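
The arithmetic behind this workaround can be checked with a few lines. This is only a sketch using rounded monoisotopic masses. It assumes the Unimod composition H(3) C(2) N O for Carbamidomethyl, and the dictionary and function names are hypothetical.

```python
# Rounded monoisotopic masses in Da; "15N" is treated as its own symbol.
MONO = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.0030740,
    "15N": 15.0001089,
    "O": 15.9949146,
    "Li": 7.0160034,
}


def mass(composition: dict) -> float:
    """Monoisotopic mass of an elemental composition such as {'C': 2}."""
    return sum(MONO[element] * count for element, count in composition.items())


carbamidomethyl = {"H": 3, "C": 2, "N": 1, "O": 1}             # unlabelled
carbamidomethyl_15n = {"H": 3, "C": 2, "15N": 1, "O": 1}       # with the label applied
substitute = {"C": 4, "H": 2, "Li": 1}                         # nitrogen-free stand-in

difference = mass(substitute) - mass(carbamidomethyl)

if __name__ == "__main__":
    print(round(mass(carbamidomethyl), 4))       # true delta of the reagent
    print(round(mass(carbamidomethyl_15n), 4))   # delta if the N were labelled
    print(round(mass(substitute), 4))            # delta of the stand-in
    print(round(difference, 4))                  # error introduced by the stand-in
```

The stand-in is about 0.01 Da heavier than the real modification, which is far closer than the roughly 1 Da error that would result from labelling the nitrogen in Carbamidomethyl.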

Deamidation can be both post-translational and artefactual but, because it is a loss of nitrogen, it will always be handled correctly by Mascot. Note that this means that deamidation of a 15N sample causes almost no mass shift, which can be a bit of a headache in quantitation if the modified and unmodified peptides are not separated by chromatography, since there will be two overlapping light distributions but only one heavy. (Unless you have an FT-ICR and can make use of the mass shift of just -0.013 Da!)
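
The two deamidation shifts follow directly from the composition H(-1) N(-1) O. A minimal sketch with rounded monoisotopic masses (the function name is hypothetical):

```python
# Rounded monoisotopic masses in Da.
H = 1.00782503
O = 15.9949146
N14 = 14.0030740
N15 = 15.0001089


def deamidation_shift(nitrogen_mass: float) -> float:
    """Mass change on deamidation: lose one H and one N, gain one O."""
    return O - H - nitrogen_mass


light_shift = deamidation_shift(N14)   # unlabelled (14N) peptide
heavy_shift = deamidation_shift(N15)   # fully 15N labelled peptide

if __name__ == "__main__":
    print(round(light_shift, 4), round(heavy_shift, 4))
```

In the light peptide the shift is close to +0.984 Da, which moves the deamidated form onto the isotope envelope of the unmodified form, while in the heavy peptide the shift is close to -0.013 Da, so the two forms all but coincide.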

Scan level modifications

Modifications can be defined with different scope. When a variable modification is selected in the search form, it applies to the entire search. When a variable modification is specified at component level in a quantitation method, it only applies to that component. You can also specify a variable modification at scan (or query) level, so that it only applies to that spectrum. Right now, this is little used, because there isn’t any automated way to annotate peak lists with modifications. This is probably a good thing, because scan level modifications currently contribute to the total count of variable modifications for the search, and using this feature in any general sense can cause the search to fail. This will be fixed in Mascot 2.5, when scan level modifications will be treated fully independently. This could be very useful if the instrument software or some application has a smart way of detecting the presence of a particular modification in a particular spectrum.

Two final points to note: scan level modifications can only be variable, not fixed, and any scan level modifications are combined with those specified at search level; they don’t replace them.
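
The scoping rules for scan level modifications can be modelled in a few lines. This is a sketch of the behaviour described in this section, not Mascot code: the function names, the query numbers, and the choice of modifications are all hypothetical.

```python
def effective_mods(search_level, scan_level):
    """Variable modifications considered for one query: scan level
    modifications are added to the search level ones, never a replacement."""
    return sorted(set(search_level) | set(scan_level))


def total_variable_mods(search_level, scan_level_by_query):
    """Count of distinct variable modifications across the whole search,
    with scan level modifications contributing to the total."""
    total = set(search_level)
    for mods in scan_level_by_query.values():
        total |= set(mods)
    return len(total)


search_level = ["Oxidation (M)"]
scan_level_by_query = {
    1: ["Phospho (ST)"],
    2: ["Acetyl (K)"],
    3: [],
}

if __name__ == "__main__":
    for query, mods in scan_level_by_query.items():
        print(query, effective_mods(search_level, mods))
    print(total_variable_mods(search_level, scan_level_by_query))
```

Each query here sees at most two variable modifications, yet the count for the search as a whole is three, which is why annotating many spectra with different modifications can push the search-wide total too high.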


2 comments on “Modifications round-up, part 2”

  1. matthias glueckmann said:

    Hello,

    nice information update.

    The option to introduce the C(4)H(2)Li sounds scientifically rather odd to me. In case of a DB search you would need to redo the search for publication. Also the scores will be altered in this case as you will have different metabolic labeling?

    Best regards

    Matthias

    • John Cottrell said:

      It’s a workaround for anyone doing 15N metabolic and alkylating with iodoacetamide. Not ideal, I agree. The mass difference of 0.01 Da is 10 ppm for a 1000 Da peptide, so mass tolerances shouldn’t be set too tightly. As mentioned, a better solution is to alkylate with iodoacetic acid.