Obsolete export formats
This page documents export formats that are no longer supported by Mascot. We recommend exporting results in Mascot XML, Mascot CSV, mzTab or mzIdentML format.
pepXML
Note: The pepXML format exported by Mascot is very old, version 1.8, and does not support features added in recent versions of Mascot.
pepXML is the interchange format for database search results used in the Institute for Systems Biology Trans-Proteomic Pipeline.
The pepXML format applies only to MS/MS search results and represents "raw" peptide match data. Information is exported for all matches to all queries (MS/MS spectra). For each match, extensive information is provided for the first protein in which the peptide is found, and more limited information for all the other proteins. This can make the output file very large.
Precise details for individual data items, such as the data type and whether it is optional, can be found in the XML schema. Schema documentation has been generated by xs3p. For general XML Schema considerations, see the section further down this page.
Usage
For speed and efficiency, leave all the checkboxes under Optional Protein Hit Information unchecked. (See Optional Protein Hit Information for further information on the use of these checkboxes.)
Limitations
- Where elements and attributes are required by the schema but the data is not available from Mascot, zero-length strings are output. This applies, for example, to the base_name, raw_data_type and raw_data attributes of an msms_run_summary element.
- The schema includes extensive information for the first protein in which a peptide match is found, even though this may not be the preferred or final assignment.
- The amino acid residues that bracket a peptide are only available if the result file is from Mascot 2.1 or later.
- The num_matched_ions attribute of the search_hit element is the number of mass values used to score the match, not the total number of mass values that could be matched to all the calculated ion series.
- In a search_result element, the start_scan and end_scan attributes are always set to 0.
- modification_info elements are only exported for variable modifications, not for fixed.
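A consumer of the exported file may want to allow for these limitations when reading it. The following is a minimal sketch in Python, not part of Mascot or the Trans-Proteomic Pipeline. The XML fragment is hand-written for illustration and is not real Mascot output; the peptide value, the function names and the choice to map placeholders to None are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, invented for illustration only.
SAMPLE = """<msms_pipeline_analysis>
  <msms_run_summary base_name="" raw_data_type="" raw_data="">
    <search_result start_scan="0" end_scan="0">
      <search_hit hit_rank="1" peptide="LVNELTEFAK" num_matched_ions="9"/>
    </search_result>
  </msms_run_summary>
</msms_pipeline_analysis>"""


def local_name(tag):
    """Reduce a namespaced tag such as '{uri}name' to 'name'."""
    return tag.rsplit("}", 1)[-1]


def none_if_empty(value):
    """Treat the zero-length placeholder strings as missing data."""
    return value if value else None


def summarise(xml_text):
    """Collect run attributes and hits, allowing for the limitations above."""
    root = ET.fromstring(xml_text)
    runs = []
    for run_elem in root.iter():
        if local_name(run_elem.tag) != "msms_run_summary":
            continue
        run = {
            name: none_if_empty(run_elem.get(name))
            for name in ("base_name", "raw_data_type", "raw_data")
        }
        run["hits"] = []
        for result in run_elem.iter():
            if local_name(result.tag) != "search_result":
                continue
            # start_scan and end_scan are always 0 in this export,
            # so they carry no information and are not read here.
            for hit in result.iter():
                if local_name(hit.tag) == "search_hit":
                    run["hits"].append({
                        "peptide": hit.get("peptide"),
                        # Number of mass values used to score the match,
                        # not the total that could be matched.
                        "scored_ions": int(hit.get("num_matched_ions", "0")),
                    })
        runs.append(run)
    return runs


runs = summarise(SAMPLE)
print(runs)
```

Matching on the local tag name lets the sketch work whether or not the file declares an XML namespace.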
DTASelect
Note: The DTASelect format exported by Mascot is very old, corresponding to DTASelect 1.9, and does not support features added in recent versions of Mascot.
DTASelect is an application that was written by David L. Tabb at The Scripps Research Institute. Originally intended for analysing Sequest results, it groups peptide matches into proteins and allows a variety of filters to be applied. Although DTASelect includes built-in support for Mascot result files, the information in the result file is not fully utilised and the interface is prone to break with new Mascot releases. Choosing DTASelect in this export utility creates a DTASelect intermediate file, DTASelect.txt, containing a more complete picture of the search results. This intermediate file is then read by DTASelect to create filtered reports.
The output file is compatible with DTASelect 1.9 only. DTASelect format is only applicable to MS/MS search results.
Usage
For speed and efficiency, it is advisable to choose MudPIT scoring, an ions score cut-off of 10, and leave all the checkboxes under Optional Protein Hit Information unchecked. (See Optional Protein Hit Information for further information on the use of these checkboxes.) Save the exported file to a directory, make this the current directory, and execute DTASelect.
The DTASelect spectrum filters, which can be supplied on the command line or taken from DTASelect.params, should include the following changes to the defaults:
- --Mascot: sets Mascot mode
- -1 10.0: sets the minimum ions score for 1+ peptides to 10
- -2 10.0: sets the minimum ions score for 2+ peptides to 10
- -3 10.0: sets the minimum ions score for 3+ peptides to 10
- -d 20: sets the minimum for (1 / expectation value) to 20
- -p 1: sets the distinct peptide threshold to 1
- --mw 100.0: sets the minimum protein mass to 100
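The filter settings listed above can be assembled into a single command line. The sketch below does this in Python. It assumes that the two long options are spelled with a double hyphen, and it uses the bare name DTASelect as a placeholder for however the program is launched on a given system; neither detail is confirmed by this page.

```python
import shlex

# Filters from the list above. The double-hyphen spelling of the
# long options is an assumption.
MASCOT_FILTERS = (
    ("--Mascot", None),   # Mascot mode (takes no value)
    ("-1", "10.0"),       # minimum ions score, 1+ peptides
    ("-2", "10.0"),       # minimum ions score, 2+ peptides
    ("-3", "10.0"),       # minimum ions score, 3+ peptides
    ("-d", "20"),         # minimum for 1 / expectation value
    ("-p", "1"),          # distinct peptide threshold
    ("--mw", "100.0"),    # minimum protein mass
)


def build_args(filters=MASCOT_FILTERS):
    """Flatten (flag, value) pairs into an argument list."""
    args = []
    for flag, value in filters:
        args.append(flag)
        if value is not None:
            args.append(value)
    return args


# "DTASelect" is a placeholder launcher name, not a verified command.
command = shlex.join(["DTASelect"] + build_args())
print(command)
```

The same argument list could instead be written to DTASelect.params; the layout of that file is not described here, so the sketch only covers the command-line form.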
In a DTASelect report of Mascot results, the following columns are different from those in a DTASelect report of Sequest results:
- Filename: Mascot result filename, query number and precursor charge, separated by periods
- IonsScore: Mascot ions score
- Signif: 1 / expectation value
- SpR: Peptide match rank, between 1 (highest) and 10 (lowest)
- SpScore: Identity threshold score
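Two of these columns are easy to unpack when post-processing a report. The sketch below, in Python, splits a Filename value into its three parts and converts a Signif value back into an expectation value. The sample value F001234.dat.17.2 is hypothetical, and the sketch assumes the charge is written as a plain integer; check this against a real report before relying on it.

```python
def parse_filename_field(field):
    """Split 'resultfile.query.charge' into its three parts.

    The split is made from the right because the result filename
    may itself contain periods.
    """
    result_file, query, charge = field.rsplit(".", 2)
    return result_file, int(query), int(charge)


def expectation_from_signif(signif):
    """Signif is 1 / expectation value, so invert it."""
    return 1.0 / signif


# Hypothetical Filename value, invented for illustration.
parsed = parse_filename_field("F001234.dat.17.2")
print(parsed)

# A Signif of 20 corresponds to an expectation value of 0.05.
print(expectation_from_signif(20))
```

By the same relationship, a minimum Signif of 20 keeps only matches with an expectation value of 0.05 or lower.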
Limitations
- The output file is compatible with DTASelect 1.9 only
- Hyperlinks to Sequest utilities will not work
- The number of tryptic termini for a peptide is not available
- The amino acid residues that bracket a peptide are only available if the result file is from Mascot 2.1 or later. For result files from earlier versions, question marks are displayed
- DTASelect reports do not display variable terminus modifications