
Combining multiple .dat files
[Mascot results file module]

Overview

Mascot Parser 2.3.03 and later support combining multiple results files. This is useful, for example, when two searches have been performed, one with ETD data and another with CID data, and you want to combine the results into a single new ms_peptidesummary.

To combine multiple files, create an ms_mascotresfile object for the first .dat file, then call appendResfile() to add each subsequent file to it. Then create a new ms_peptidesummary object from that first ms_mascotresfile object only. For example (excluding any error handling):

C++
   ms_mascotresfile resfile1 = ms_mascotresfile("F0001.dat");
   resfile1.appendResfile("F0002.dat");
   resfile1.appendResfile("F0003.dat");
   ms_peptidesummary pepsum = ms_peptidesummary(resfile1,...

Perl
   my $resfile1 = new msparser::ms_mascotresfile("F0001.dat");
   $resfile1->appendResfile("F0002.dat");
   $resfile1->appendResfile("F0003.dat");
   my $pepsum = new msparser::ms_peptidesummary($resfile1,...

Java and C#
   ms_mascotresfile resfile1 = new ms_mascotresfile("F0001.dat");
   resfile1.appendResfile("F0002.dat");
   resfile1.appendResfile("F0003.dat");
   ms_peptidesummary pepsum = new ms_peptidesummary(resfile1,...

Python
   resfile1 = msparser.ms_mascotresfile("F0001.dat")
   resfile1.appendResfile("F0002.dat")
   resfile1.appendResfile("F0003.dat")
   pepsum = msparser.ms_peptidesummary(resfile1,...

Query numbers for the second file are 'renumbered' to start at one greater than the number of queries in the first file. This is all handled internally by Mascot Parser at the lowest level. So it is safe, for example, to call getSectionValueStr() with the key "q1000_p1" on the primary resfile, even though this query may be in the second resfile.
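The renumbering can be pictured as a running offset. The following is a minimal sketch in plain Python, not Mascot Parser code: the helper names (query_offsets, combined_query_number, source_query) and the query counts are hypothetical, and it assumes that a third or later appended file continues the numbering in the same way as the second.

```python
from itertools import accumulate


def query_offsets(query_counts):
    """Return the offset added to each file's local query numbers.

    query_counts[i] is the number of queries in file i (0 = primary file).
    Assumption: each appended file continues where the previous one ended.
    """
    # File 0 has offset 0; file i is offset by the total queries before it.
    return [0] + list(accumulate(query_counts))[:-1]


def combined_query_number(query_counts, file_index, local_query):
    """Map (file index, 1-based local query) to the combined query number."""
    if not 1 <= local_query <= query_counts[file_index]:
        raise ValueError("local query number out of range")
    return query_offsets(query_counts)[file_index] + local_query


def source_query(query_counts, combined_query):
    """Map a combined query number back to (file index, local query)."""
    if combined_query < 1:
        raise ValueError("combined query number out of range")
    for file_index, offset in enumerate(query_offsets(query_counts)):
        if combined_query <= offset + query_counts[file_index]:
            return file_index, combined_query - offset
    raise ValueError("combined query number out of range")


# Hypothetical query counts for F0001.dat, F0002.dat and F0003.dat.
counts = [800, 500, 300]
first_in_second_file = combined_query_number(counts, 1, 1)
where_is_q1000 = source_query(counts, 1000)
```

With these made-up counts, the first query of the second file becomes combined query 801, and combined query 1000 resolves to query 200 of the second file. In real code, use getSrcQueryAndFileIdForMultiFile() rather than reimplementing the mapping.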

If you need access to the appended results files, use getResfile(). You may also need to use getSrcQueryAndFileIdForMultiFile() to return a query number in the original resfile.

The appended objects are deleted when the parent ms_mascotresfile object is deleted.

Limitations

Trying to combine results where any of the following differ will result in an error:

Currently, all variable modifications must be the same, and in the same order in each file. This restriction may be changed in a future release.
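Because the order matters as well as the content, a check before combining has to compare the modifications as ordered lists rather than as sets. A minimal sketch, assuming the variable modification names of each file have already been collected into plain Python lists (the function name and the example lists are hypothetical, and how the names are read from each file is not shown):

```python
def variable_mods_match(mods_per_file):
    """Return True if every file lists the same variable mods in the same order.

    mods_per_file is a list with one list of modification names per file.
    """
    if not mods_per_file:
        return True
    first = list(mods_per_file[0])
    # List equality is order-sensitive, which is what the restriction requires.
    return all(list(mods) == first for mods in mods_per_file[1:])


# Hypothetical examples: same mods in the same order, then in a different order.
same_order = variable_mods_match([
    ["Oxidation (M)", "Phospho (ST)"],
    ["Oxidation (M)", "Phospho (ST)"],
])
different_order = variable_mods_match([
    ["Oxidation (M)", "Phospho (ST)"],
    ["Phospho (ST)", "Oxidation (M)"],
])
```

Here same_order is True and different_order is False, even though both files in the second case use the same two modifications.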

Combining integrated error tolerant searches (see Integrated error tolerant search) is fully supported. Combining old-style error tolerant searches is not supported.

A combined protein summary is not supported. After calling appendResfile(), the function getNumHits() will return -1. A side effect of this is that getProteinMass() and getProteinDescription() will only work if there is an entry in a protein section in one of the files; they will not look in the summary sections.

Percolator is not currently supported for combined results files.

Changed and additional API

Additional functions:

Changed functions:

Cache Files

The section "Using the ms_mascotresfile cache files" gives details of how to use the 'resfile cache'. When a call is made to the primary ms_mascotresfile object that requires it to look in another results file, it will use the cache if that was specified for the second results file when calling appendResfile().

The ms_peptidesummary cache is slightly more complex, but still works. See "Filenames for the ms_mascotresults/ms_peptidesummary cache files". The base part of the cache filename will be the primary results file, but the 26-character hash uses the names of all the subsequent files added using matrix_science::ms_mascotresfile::appendResfile. This ensures that different cache files will be used for opening a results file on its own and for opening the same results file with appended files. The name of the cache file also depends on the order in which the additional results files are added.
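The order dependence can be illustrated with a small stand-in. This sketch is not the hash that Mascot Parser uses, which is not described here: it assumes a SHA-1 digest, base32-encoded and truncated to 26 characters, purely to show why a hash over an ordered list of filenames changes when the order changes. The function name cache_key is hypothetical.

```python
import base64
import hashlib


def cache_key(appended_files):
    """Return a 26-character key derived from the ordered appended filenames.

    Stand-in only: SHA-1 plus base32 is an assumption made for illustration.
    """
    # Join with a NUL separator so that the order and the boundaries between
    # filenames both affect the digest.
    joined = "\0".join(appended_files).encode("utf-8")
    digest = hashlib.sha1(joined).digest()
    # 20 digest bytes encode to 32 base32 characters; keep the first 26.
    return base64.b32encode(digest).decode("ascii")[:26]


key_a = cache_key(["F0002.dat", "F0003.dat"])
key_b = cache_key(["F0003.dat", "F0002.dat"])
key_alone = cache_key([])
```

In this model key_a and key_b differ, so appending the same files in a different order would select a different cache file, and key_alone differs from both, matching the behaviour described above for a results file opened without appended files.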


Copyright © 2022 Matrix Science Ltd.  All Rights Reserved. Generated on Thu Mar 31 2022 01:12:30