Welcome

We have some suggestions for helping you determine the optimum license size for your Mascot Server.

This month's highlighted publication shows the complementary use of peptide fingerprints and MS/MS to characterise ancient proteins. If you have a recent publication that you would like us to consider for an upcoming Newsletter, please send us a PDF or a URL.

Mascot tip of the month concerns searching the complete NCBI nr database.

Please have a read and feel free to contact us if you have any comments or questions.

 

March 2021

Mascot Server license size
Featured publication
Mascot tip of the month
 

How large a Mascot Server license do I need?

A common question is: How many CPUs do I need to license in order to have adequate search speed? It is reasonable to assume that you want the search engine to analyze the data faster than your mass spectrometer(s) can acquire it. Whether a given license achieves this depends on both the computer hardware and the search parameters, and the best way to assess it is to process some example files.

We ran a publicly available data set from a CPTAC project using the published search conditions. With a 1 CPU Mascot Server license (4 cores), searches were completed in 2 to 3 minutes per fraction. Data acquired over seven days could be searched in less than 8 hours. Under other circumstances, a larger license might be required:

  • If the samples were from an unknown bacterial species and had to be searched against all the bacterial sequences, each fraction might take 20 to 30 minutes. A 2 or 3 CPU license might be the better option.
  • If you run a core lab or the instrument is operating 24/7, then a 2 to 4 CPU license might be more suitable.
  • If you run a large core lab or a lab with multiple instruments, you might need a license for 5 or more CPUs.
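
The comparison between search time and acquisition time can be sketched as a simple ratio. This is only an illustration: the function name search_load is made up for this example, and the 120-minute acquisition time per fraction is an assumption, not a figure from the CPTAC data set. The 8 hours, seven days, and 30 minutes come from the text above.

```python
def search_load(search_minutes: float, acquisition_minutes: float) -> float:
    """Return search time as a fraction of acquisition time.

    Below 1.0 the search engine keeps up with the instrument;
    above 1.0 a backlog builds.
    """
    if acquisition_minutes <= 0:
        raise ValueError("acquisition_minutes must be positive")
    return search_minutes / acquisition_minutes


# From the text: seven days of acquisition searched in under 8 hours.
cptac_load = search_load(8 * 60, 7 * 24 * 60)
print(f"CPTAC example: load below {cptac_load:.3f}")

# Hypothetical: 120 minutes of acquisition per fraction (not from the text),
# combined with the 30-minute upper estimate for the bacterial search.
bacterial_load = search_load(30, 120)
print(f"Bacterial example (assumed 120 min runs): load = {bacterial_load:.2f}")
```

A load well below 1 on a single instrument leaves headroom. As a rough guide, several instruments feeding one server add their loads together, which is why timing your own example files is the most reliable test.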

Go here to read more details about sizing your Mascot Server license.

Featured publication using Mascot

Here we highlight a recent, interesting, and important publication that employs Mascot for protein identification, quantitation, or characterization. If you would like one of your papers highlighted here, please send us a PDF or a URL.

 

Proteome Variation with Collagen Yield in Ancient Bone

Noemi Procopio, Rachel J.A. Hopkins, Virginia L. Harvey, and Michael Buckley

J. Proteome Res. 2021. Publication date: February 2, 2021

The authors investigated the survival of proteins in ancient bone specimens from four different archaeological cave sites in Bulgaria and Hungary. The 29 bone samples were drilled to create bone powder that was solvent washed, then sequentially washed and incubated with HCl. After buffer exchange, the samples were digested with trypsin and analyzed by MALDI-TOF (peptide fingerprint) and LC-MS/MS. The spectra were searched against the Swiss-Prot database without any taxonomy filter.

Using MALDI-TOF (an approach they call Zooarchaeology by Mass Spectrometry, or ZooMS), the authors were able to assign a taxonomy to 22 of the 29 samples: bovine (cattle or bison; n = 6), horse (n = 9), human (n = 2), and cervine (red deer or elk; n = 5). The LC-MS/MS data were used to further refine the taxonomic classifications, verifying or generating species-level identifications for 25 of the 29 samples, with the remainder suspected to be Bovidae/Cervidae.

No trend was observed between a sample's "collagen yield" (for radiocarbon dating purposes) and the proteome complexity. Thus, samples that appear to have poor collagen yield can still provide valuable proteomic information.

Mascot Tip

We get a steady trickle of support questions relating to the NCBI nr database. The sheer size of this database creates a number of problems. The compressed file is over 90 GB, so downloading is a challenge unless your connection is very fast. The unpacked FASTA file is almost double this size so, if you want to keep it up to date, you need to allocate more than half a terabyte of disk space to allow for the presence of two sets of files. There are some 350 million proteins, so building the taxonomy and accession string indexes required by Mascot takes many hours. And, of course, searches take a very long time, use a lot of RAM, and generate bloated result files and hugely redundant result reports.
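
One way to read the disk space arithmetic is sketched below. It assumes that each set of files holds both the compressed download and the unpacked file, and it rounds "almost double" up to a factor of 2. Both are assumptions made for this illustration, and the function name is hypothetical.

```python
def disk_needed_gb(compressed_gb: float, unpack_ratio: float, sets: int = 2) -> float:
    """Estimate disk space for `sets` copies of a database.

    Each set is assumed to hold the compressed download plus the
    unpacked file, which is `unpack_ratio` times the compressed size.
    """
    per_set = compressed_gb * (1 + unpack_ratio)
    return per_set * sets


# 90 GB compressed ("over 90 GB" in the text), unpacked size assumed to be 2x.
total = disk_needed_gb(compressed_gb=90, unpack_ratio=2.0, sets=2)
print(f"Estimated space for two sets: {total:.0f} GB")
```

Under these assumptions the estimate comes to a little over half a terabyte, in line with the figure quoted above.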

Unless you have a very strong reason to have the complete database online, it is far better to download subsets of entries for individual species or families or, in extreme cases, classes. This help page illustrates how to proceed.

Always remember that the larger the search space, the more difficult it is to get statistically significant matches. For the vast majority of searches, you will get better coverage from a database that accurately represents the proteome of interest, such as a UniProt proteome.
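
As a rough illustration of why search space matters, assume a Bonferroni-style correction in which a match must have a probability below alpha / N of being random, where N is the number of candidate peptides, and express that limit on a -10 log10 scale. This is a simplified model for illustration only, not a description of how Mascot computes its thresholds. The function name and the candidate counts are hypothetical.

```python
import math


def score_threshold(n_candidates: int, alpha: float = 0.05) -> float:
    """Score needed for significance under a Bonferroni-style correction.

    A match is treated as significant when its random-match probability
    is below alpha / n_candidates; the result is on a -10*log10 scale.
    """
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")
    return -10 * math.log10(alpha / n_candidates)


# Hypothetical candidate counts for a focused and a very large database.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} candidates: threshold = {score_threshold(n):.1f}")
```

In this model, every tenfold increase in the number of candidates raises the required score by 10, so a spectrum that gives a significant match against a focused proteome can fall below the threshold against a much larger database.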

About Matrix Science

Matrix Science is a provider of bioinformatics tools to proteomics researchers and scientists, enabling the rapid, confident identification and quantitation of proteins. Mascot software products fully support data from mass spectrometry instruments made by Agilent, Bruker, Sciex, Shimadzu, Thermo Scientific, and Waters.

Please contact us or one of our marketing partners for more information on how you can power your proteomics with Mascot.

 

Matrix Science Ltd, 64 Baker Street, London W1U 7GB, UK
T +44 (0)20 7486 1050  F +44 (0)20 7224 1344  E info@matrixscience.com
 
