Blog
Articles tagged: statistics
Mascot workflow for LC-MS/MS data
Data analysis in mass spectrometry proteomics is complex and, nowadays, almost entirely software driven. Processing a raw file, peptide identification by database searching, protein inference and protein quantitation all have many steps and built-in assumptions, not to mention a huge number of parameters. Software continues to evolve, as does best practice. Whether you are new to mass spectrometry proteomics or [...]
Error tolerant searches now show statistical significance
The latest release of Mascot Server introduces some important changes to error tolerant searches. Matches from the second pass search now have expect values attached, indicating confidence levels. These are either estimates based on counting trials or empirical values derived from searching a decoy database. If you are not familiar with the error tolerant search, now is the time to [...]
Using the Quantitation Summary to create reports and charts
An earlier article described how to create a Quantitation Summary in Mascot Daemon. This is a spreadsheet-like text file, where the rows correspond to proteins and the columns contain expression data for various samples in the form of abundances or ratios of abundances. A Quantitation Summary can be opened and manipulated in a spreadsheet program such as Excel, and it [...]
Tabulate expression data from multiple analyses with Mascot Daemon
Studies that use mass spectrometry-based quantitation often contain large numbers of individual analyses: samples from different sources or treatments or time points, possibly fractionated, with replicates and so forth. Using statistical methods to combine the analyses, extract meaningful information, and report it as charts and tables is a complex task that usually requires custom scripting in a language such as [...]
Protein FDR in Mascot Server 2.7
One of the new features in Mascot Server 2.7, now running on this web site, is an estimate of protein FDR. This is displayed in the Protein Family Summary for Fasta searches whenever automatic decoy is selected. The basis is the number of proteins inferred in the target database compared with the number in the decoy database. Conceptually, this is [...]
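To make the target/decoy comparison concrete, here is a minimal sketch of the simplest form such an estimate can take: the count of decoy proteins divided by the count of target proteins. The excerpt above is truncated, so this is an assumption about the general idea, not a statement of the exact calculation in Mascot Server 2.7. The function name and the choice to return 0.0 when there are no target proteins are hypothetical.

```python
def estimate_protein_fdr(target_proteins: int, decoy_proteins: int) -> float:
    """Simple target/decoy estimate of protein FDR (illustrative assumption).

    target_proteins: number of proteins inferred from the target database
    decoy_proteins:  number of proteins inferred from the decoy database
    """
    if target_proteins < 0 or decoy_proteins < 0:
        raise ValueError("protein counts must be non-negative")
    if target_proteins == 0:
        # Nothing inferred in the target database, so nothing to be wrong about.
        return 0.0
    # Cap at 1.0 so that an excess of decoy hits still reads as a proportion.
    return min(1.0, decoy_proteins / target_proteins)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    print(f"{estimate_protein_fdr(1000, 10):.1%}")
```

With these made-up counts, 10 decoy proteins against 1000 target proteins corresponds to an estimate of 1%.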
What are you inferring?
Benchmarking protein inference is notoriously difficult. Artificial samples of known content tend to be too simple while real samples lack ground truth. An interesting approach was adopted for the ABRF iPRG 2016 study, and has been the subject of a publication from The et al. A collection of human Protein Epitope Signature Tags (PrESTs) were expressed in E. coli and [...]
Back to basics 5: Peptide-spectrum match statistics
Mascot can identify peptides in uninterpreted MS/MS data. Observed spectra are submitted to Mascot as search queries. A query specifies the precursor ion m/z and charge state as well as the MS/MS peak list. Mascot digests protein sequences from the chosen database and selects peptide sequences whose mass is within the specified tolerance of the query’s precursor mass. The software [...]
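The candidate selection step described above can be sketched in a few lines. This is an illustration under stated assumptions, not Mascot's implementation: it assumes a tryptic digest with no missed cleavages, unmodified residues, monoisotopic masses, and a tolerance given in Da. All function names are hypothetical.

```python
import re

PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def digest_trypsin(protein):
    """Cleave after K or R unless followed by P; no missed cleavages.

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def neutral_mass(mz, charge):
    """Convert a precursor m/z and charge state to a neutral mass."""
    return (mz - PROTON) * charge


def candidate_peptides(proteins, mz, charge, tol_da):
    """Return peptides whose mass is within tol_da of the query's precursor mass."""
    target = neutral_mass(mz, charge)
    return [
        pep
        for protein in proteins
        for pep in digest_trypsin(protein)
        if abs(peptide_mass(pep) - target) <= tol_da
    ]


if __name__ == "__main__":
    # Toy "database" of one made-up sequence and a made-up query.
    print(candidate_peptides(["GKAR"], mz=123.5817, charge=2, tol_da=0.01))
```

In a real search, the peptides that pass this filter are the ones that go on to be scored against the MS/MS peak list; the sketch stops at the mass filter.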
Mascot workflows in Proteome Discoverer
For many users of Thermo instruments, Proteome Discoverer (PD) is their primary user interface for database searching, and Mascot is represented by a node in the workflow. This article collects a few tips and observations concerning Proteome Discoverer 2.3 and Mascot Server 2.6. Proteome Discoverer Configuration: Under Administration; Mascot Server, the setting Max. MGF File Size [MB] has a [...]
Back to basics 3: Quantitation statistics
Mascot Server and Distiller support a number of different quantitation methods. Each method is carried out at the peptide level; the peptides are then grouped into protein families, and the peptide quantitation results are used to calculate protein ratio values. Mascot and Distiller perform a number of statistical procedures and tests to give you an indication of the quality and reliability [...]
Some peaks are more equal than others
When you look at the details of a peptide match in the Mascot Peptide View report, only a small number of the peaks may be labelled in the spectrum graphic and highlighted in the table of fragment masses. We were often challenged about this: "Why haven’t you labelled these other peaks that clearly match?" So, in Mascot 2.3, we added [...]