Blog
Articles tagged: Unimod
O-fucosylated CID spectra
O-linked fucose is easily lost in CID. A recent paper by Swearingen et al. in the Journal of Proteome Research discusses this in the context of identifying O-fucosylated thrombospondin type 1 repeats (TSRs) in Plasmodium parasites using database searching. The main problem is that the O-glycosidic bond is weaker than the peptide backbone. Collision energies typical for peptide fragmentation cause it [...]
Results round-up for the ‘dark matter’ challenge
In June, we tried to harness the power of crowd-sourcing to explain some of the unidentified modifications found in open database searches. We selected 20 abundant and unassigned mass deltas from Supplementary Table 3 of the recent MSFragger paper from Alexey Nesvizhskii’s group at U. Michigan and offered prizes for the first credible explanations. There were 35 unannotated deltas in [...]
Trying to illuminate proteomics ‘dark matter’
The May 2017 issue of Nature Methods has a paper from Alexey Nesvizhskii’s group at U. Michigan describing a new open database search program called MSFragger. Strikingly, they also observed the two highly abundant but unidentified mass deltas reported in Steven Gygi’s 2015 mass tolerant paper: 301.9864 and 249.9803. The challenges of open searching were discussed in an earlier blog [...]
Selenocysteine
David Fenyö and Ron Beavis have a short paper in J. Proteome Research that draws attention to a potential problem with peptides containing selenocysteine (1-letter code U, 3-letter code Sec). Samples are frequently alkylated, yet modified U is unlikely to be considered in the search. This need not be an issue for Mascot searches, but you may have no modifications [...]
Mass-tolerant vs Error tolerant
"A mass-tolerant database search identifies a large proportion of unassigned spectra in shotgun proteomics as modified peptides" in Nature Biotechnology is from Steven Gygi’s lab at Harvard Medical School. It describes the use of a very wide precursor mass tolerance, +/- 500 Da, to identify modified peptides in a Sequest search. How does this approach, which the authors call an [...]
PSI file formats, part 3: repositories
We’ve talked about mzIdentML validity only in terms of file structure. Proteomics repositories, such as PRIDE or ProteoRed, of course require files to be valid in that sense, but they impose additional requirements. If you need to upload your search results to a repository, it is worth looking at this more extended idea of validity. For simplicity, I’ll only consider [...]
Modifications round-up, part 2
This is the second of two articles dealing with topics relating to modifications. The first can be found here. Note that Site analysis was covered in an earlier article. Why aren’t amino acid substitutions listed in the search form? Amino acid substitutions are rare and there are lots of them, so the only practical way to use them is in [...]
Modifications round-up, part 1
Much of the complexity in Mascot is associated with modifications. It can be hard to find information about some aspects of handling modifications unless you already know what you are looking for. In this blog article, the first of two, I’ll collect together some of the topics that come up frequently in support emails. Note that Site analysis was [...]
Non-standard amino acid residues
Mascot only supports the 26 letters of the Latin alphabet as one-letter codes in sequence database entries. It is also case-insensitive, so you cannot use (say) R and r for different residues. This is quite a limitation if you want to create a custom database that encodes non-standard or modified residues. It isn’t a concern if you search only public [...]