Matrix Science Mascot Parser toolkit
 
ms_mascotresfilebase Class Reference (abstract)

Abstract base class of ms_mascotresfile_dat and ms_mascotresfile_msr.

#include <ms_mascotresfilebase.hpp>


Public Types

enum  FLAGS {
  RESFILE_NOFLAG = 0x00000000 ,
  RESFILE_USE_CACHE = 0x00000001 ,
  RESFILE_CACHE_IGNORE_ACC_DUPES = 0x00000002 ,
  RESFILE_USE_PARENT_PARAMS = 0x00000004 ,
  RESFILE_CACHE_IGNORE_DATE_CHANGE = 0x00000008
}
 Flags for opening the results file.
 
enum  KA_TASK {
  KA_CREATEINDEX_CI = 0 ,
  KA_READFILE_RF = 1 ,
  KA_ASSIGNPROTEINS_AP = 2 ,
  KA_GROUPPROTEINS_GP = 3 ,
  KA_UNASSIGNEDLIST_UL = 4 ,
  KA_QUANTITATION = 5 ,
  KA_CREATECACHE_CC = 6 ,
  KA_THRESHFORFDR_FDR = 7 ,
  KA_LAST = 8
}
 Processing some results files is computationally intensive. These are the tasks that can be performed.
 
enum  PERCOLATOR_FILE_NAMES {
  PERCOLATOR_INPUT_FILE = 0 ,
  PERCOLATOR_OUTPUT_TARGET = 1 ,
  PERCOLATOR_OUTPUT_DECOY = 2
}
 Offsets into a vector of Percolator filenames.
 
enum  RESFILE_TYPE {
  RESFILE_DAT28 = 0 ,
  RESFILE_MSR = 1 ,
  RESFILE_UNKNOWN = 2
}
 Supported results file formats.
 
enum  XML_SCHEMA {
  XML_SCHEMA_QUANTITATION = 0 ,
  XML_SCHEMA_UNIMOD = 1 ,
  XML_SCHEMA_DIRECTORY = 2 ,
  XML_SCHEMA_CROSSLINKING = 3 ,
  XML_SCHEMA_LAST = 4
}
 The results file contains embedded files in XML format and these need to be validated against a schema.
 

Public Member Functions

 ms_mascotresfilebase (const char *szFileName, const int keepAliveInterval=0, const char *keepAliveText="<!-- %d seconds -->\n", const unsigned int flags=RESFILE_NOFLAG, const char *cacheDirectory=0, const char *XMLschemaDirectory=0, ms_progress_info *progressMonitor=0)
 ms_mascotresfilebase is an abstract class; use createResfile().
 
virtual bool anyCrosslinkedMatches (const bool isDecoy=false) const =0
 Returns true if there are any Crosslinked matches.
 
virtual bool anyErrorTolerantMatches (const bool isDecoy=false) const =0
 Returns true if there are any Error Tolerant matches.
 
virtual bool anyFastaMatches (const bool isDecoy=false) const =0
 Returns true if there are any FASTA matches.
 
virtual bool anyMSMS () const =0
 Returns true if any of the queries in the search contain ions data.
 
virtual bool anyPMF () const =0
 Returns true if any of the queries in the search just contain a single peptide mass.
 
virtual bool anySpectralLibraryMatches (const bool isDecoy=false) const =0
 Returns true if there are any Spectral Library matches.
 
virtual bool anySQ () const =0
 Returns true if any of the queries in the search contain seq or comp commands.
 
virtual bool anyTag () const =0
 Returns true if any of the queries in the search contain tag or etag commands.
 
void appendErrors (const ms_errors &src)
 Copies all errors from another instance and appends them to the end of this object's list.
 
virtual int appendResfile (const char *filename, int flags=RESFILE_USE_PARENT_PARAMS, const char *cacheDirectory=0)=0
 Multiple results files can be summed together and treated as 'one'.
 
void clearAllErrors ()
 Remove all errors from the current list of errors.
 
void copyFrom (const ms_errors *right)
 Use this member to make a copy of another instance.
 
std::string get_ms_mascotresults_params (const ms_mascotoptions &opts, ms_mascotresults_params &params) const
 Return default flags and parameters for creating an ms_peptidesummary or ms_proteinsummary object.
 
std::string get_ms_mascotresults_params (const ms_mascotoptions &opts, unsigned int *gpFlags, double *gpMinProbability, int *gpMaxHitsToReport, double *gpIgnoreIonsScoreBelow, unsigned int *gpMinPepLenInPepSummary, bool *gpUsePeptideSummary, unsigned int *gpFlags2) const
 [Deprecated] Return default flags and parameters for creating an ms_peptidesummary or ms_proteinsummary object.
 
virtual std::string getCacheDirectory (bool processed=true) const =0
 Returns the directory being used for cache files (if any).
 
virtual std::string getCacheFileName () const =0
 Returns the filename of the cache file; see ms_mascotresfile_msr::getCacheFileName() and ms_mascotresfile_dat::getCacheFileName().
 
bool getCrosslinkingMethod (ms_crosslinking_method *method) const
 Return the crosslinking method object from the embedded crosslinking file.
 
virtual DATABASE_TYPE getDatabaseType (const int idx) const =0
 Return database type if available.
 
virtual int getDate () const =0
 Returns the date and time of the search in seconds since midnight January 1st 1970.
 
virtual ms_mascotoptions::DECOY_ALGORITHM getDecoyTypeForDB (const int idx=1) const =0
 Returns the decoy algorithm type for a given database.
 
virtual bool getEnzyme (ms_enzymefile *efile, const char *enzymeFileName=0) const =0
 Returns an object that represents the embedded enzyme file as a reduced enzymes file.
 
const ms_errs * getErrorHandler () const
 Retrieve the error object using this function to get access to all errors and error parameters.
 
int getErrorNumber (const int num=-1) const
 Return a specific error number - or ms_errs::ERR_NO_ERROR.
 
std::string getErrorString (const int num) const
 Return a specific error as a string.
 
virtual int getExecTime () const =0
 Returns the time taken for the search.
 
virtual std::string getFastaPath (int idx=1) const =0
 Returns the path to the FASTA file used.
 
virtual std::string getFastaVer (int idx=1) const =0
 Returns the FASTA file version.
 
virtual std::string getFileName (const int id=1) const =0
 Returns the name of the results file passed into the constructor.
 
virtual double getFirstPassThreshold () const =0
 Return the threshold value for the first pass of an automated error tolerant search.
 
virtual void getHeaderKeyValues (std::vector< std::string > &keys, std::vector< std::string > &values) const =0
 Return all the header key-value pairs.
 
virtual std::string getHeaderValue (const std::string &key) const =0
 Return the header value for the given key.
 
virtual ms_inputquery getInputQuery (const int queryNum) const =0
 Return the ms_inputquery object for the query given as argument.
 
virtual int getJobNumber (const int resfileID=1) const =0
 Return the job number for this file - obtained from the file name.
 
void getKeepAlive (KA_TASK &kaTask, int &kaPercentage, std::string &kaAccession, int &kaHit, int &kaQuery, std::string &kaText) const
 Return the progress indicators used by the keepAlive functions.
 
int getLastError () const
 Return the last error number - or ms_errs::ERR_NO_ERROR.
 
std::string getLastErrorString () const
 Return the last error as a string - or an empty string.
 
virtual int getLibraryMods (std::vector< std::string > &modNames, std::vector< double > &modDeltas) const =0
 Return all the library mod names and deltas.
 
virtual std::string getMascotVer () const =0
 Returns the version of Mascot used to perform the search.
 
bool getMasses (ms_masses *masses) const
 Returns an ms_masses object from the mass values in the results file.
 
virtual void getMassesKeyValues (std::vector< std::string > &keys, std::vector< double > &values) const =0
 Return all the residue and modification masses as key-value pairs.
 
virtual double getMassValue (const std::string &key) const =0
 Return the residue or modification mass for the given key.
 
const ms_modification * getMonoLinkModification (const int modNum, const int monoLink) const
 Returns an ms_modification object that represents a monolink variable modification.
 
std::string getMSParserVersion () const
 Returns the version number of the Mascot Parser library.
 
virtual int getMultiFileQueryNumber (const int localQuery, const int fileId) const =0
 Return the multi-file query number from the local query number in an appended file.
 
int getNumberOfErrors () const
 Return the number of errors since the last call to clearAllErrors.
 
virtual int getNumberOfResfiles () const =0
 Multiple results files can be summed together and treated as 'one'.
 
virtual int getNumEtSeqsSearched (const int idx=0) const =0
 Returns the number of sequences searched in the second pass of an integrated error tolerant search.
 
virtual int getNumLibraryEntries (const int idx=0) const =0
 Returns the number of entries in the spectral library searched.
 
virtual int getNumQueries (const int resfileID=0) const =0
 Returns the number of queries (peptide masses or ms-ms spectra).
 
virtual double getNumResidues (const int idx=0) const =0
 Returns the number of residues in the FASTA file(s) searched.
 
virtual int getNumSeqs (const int idx=0) const =0
 Returns the number of sequences in the FASTA file(s) searched.
 
virtual int getNumSeqsAfterTax (const int idx=0) const =0
 Returns the number of sequences that passed the taxonomy filter in the FASTA file(s) searched.
 
virtual int getObservedCharge (const int query, const bool decoy=false) const =0
 The 'charge' returned will be 0 for Mr, otherwise it will be 1, -1, 2, -2, 3, -3 etc. and -100 for an error.
 
virtual double getObservedIntensity (const int query) const =0
 Returns the experimental intensity for the peptide.
 
virtual double getObservedMass (const int query) const =0
 Returns the experimental mass value as entered by the user.
 
virtual double getObservedMrValue (const int query, const bool decoy=false) const =0
 Returns the experimental mass value (as a relative mass) as entered by the user.
 
std::vector< std::string > getPercolatorFileNames () const
 Retrieve the filenames used for Percolator input and output.
 
ms_progress_info * getProgressInfo (bool forPeptideSummary=false) const
 If a matrix_science::ms_progress_info object is passed to the constructor, this is returned here.
 
virtual int64_t getQmatch (const int query, const ms_peptide::PSM_TYPE pepType) const =0
 Return the number of peptide masses within precursor tolerance of this query.
 
virtual double getQplughole (const int query, const ms_peptide::PSM_TYPE pepType) const =0
 Return the threshold score for homologous peptide match (MIS only).
 
virtual bool getQuantitation (ms_quant_configfile *qfile) const =0
 Returns an object that represents the embedded quantitation file as a reduced quantitation.xml file.
 
bool getQuantitationMethod (ms_quant_method *qmethod) const
 Return the quantitation method object from the embedded quantitation file.
 
virtual int getReferenceDatabaseNumberOfSL (const int idx) const =0
 Return the database number of the reference database of a spectral library.
 
virtual std::string getRepeatSearchString (const int query, const bool fullQuery=false) const =0
 Returns the string needed to perform a repeat search for the given query.
 
virtual const ms_mascotresfilebase * getResfile (int id) const =0
 
virtual std::string getSearchParameter (const std::string &key) const =0
 Return the search parameter for the given key.
 
virtual void getSearchParametersKeyValues (std::vector< std::string > &keys, std::vector< std::string > &values) const =0
 Return all the search parameters as key-value pairs.
 
virtual std::vector< int > getSLDatabaseNumbersOfReference (const int idx) const =0
 Return the database numbers of the spectral libraries whose reference database is at the given index.
 
virtual std::string getSLExecCommand (int idx=1) const =0
 Returns the library search command line and parameters (sl_exec_command).
 
virtual double getSLFragmentTolerance (int idx=1) const =0
 Returns the effective spectral library fragment tolerance.
 
virtual std::string getSLFragmentToleranceUnit (int idx=1) const =0
 Returns the unit of the effective spectral library fragment tolerance.
 
virtual bool getSrcQueryAndFileIdForMultiFile (const int q, int &gsqNewQuery, int &gsqFileId) const =0
 Return the query number and file ID in the source results file.
 
virtual bool getTaxonomy (ms_taxonomyfile *tfile) const =0
 Returns an object that represents the embedded taxonomy file as a reduced taxonomy file.
 
virtual bool getUnimod (ms_umod_configfile *ufile, bool useSchemaFromResfile=false) const =0
 Returns an object that represents the embedded unimod file as a reduced unimod_2.xml file.
 
virtual bool getUnimodXL (ms_umod_configfile *ufile, bool useSchemaFromResfile=false) const =0
 Returns an object that represents the embedded unimod_xl file as a reduced unimod_xl.xml file.
 
virtual std::string getUniqueTaskID () const =0
 Returns the unique task ID used by Mascot Daemon.
 
std::string getXMLschemaFilePath (XML_SCHEMA XMLschema) const
 Gets the XML schema to be used by functions using quantitation or unimod.
 
virtual bool hasEnzyme () const =0
 Return true if the results file contains information about the enzyme used.
 
virtual bool hasQuantitation () const =0
 Return true if the results file contains quantitation data.
 
virtual bool hasRT () const =0
 Return true if the results file contains retention time data.
 
virtual bool isDatabaseTypeAvailable () const =0
 Check whether database types are available.
 
virtual bool isErrorTolerant () const =0
 Returns true if the search was an error tolerant search.
 
virtual bool isMSMS () const =0
 Returns true if the search was an MSMS search (SEARCH=MIS).
 
virtual bool isPMF () const =0
 Returns true if the search was a PMF search (SEARCH=PMF).
 
virtual bool isSQ () const =0
 Returns true if the search was a sequence query search (SEARCH=SQ).
 
bool isValid () const
 Call this function to determine if there have been any errors.
 
bool outputKeepAlive () const
 Outputs the "keep-alive" string during time-consuming operations.
 
ms_searchparams & params () const
 Returns a reference to the search parameters class.
 
void resetKeepAlive (const int keepAliveInterval, const char *keepAliveText, const bool propagateToAppended=true, const bool resetStartTime=false)
 Replace the existing keepAlive values with new values.
 
void setPercolatorFeatures (const char *percolatorFeatures, const char *additionalFeatures, const bool useRetentionTimes)
 Set Percolator features before creating an ms_peptidesummary with Percolator scoring (deprecated).
 
void setPercolatorFeatures (const ms_mascotoptions &options, const char *additionalFeatures, const std::vector< std::string > &adapterParameters=std::vector< std::string >())
 Set Percolator features before creating an ms_peptidesummary with Percolator scoring.
 
bool setXMLschemaFilePath (XML_SCHEMA XMLschema, const char *path)
 Sets the XML schema to be used by functions using quantitation or unimod.
 
bool versionGreaterOrEqual (int major, int minor, int revision) const
 Compare the value returned by getMascotVer() with the passed version number.
 

Static Public Member Functions

static std::unique_ptr< ms_mascotresfilebase > createResfile (const char *szFileName, const int keepAliveInterval=0, const char *keepAliveText="<!-- %d seconds -->\n", const unsigned int flags=matrix_science::ms_mascotresfilebase::RESFILE_NOFLAG, const char *cacheDirectory="../data/cache/%Y/%m", const char *XMLschemaDirectory=0, matrix_science::ms_progress_info *progressMonitor=0)
 Return a new ms_mascotresfile_msr or ms_mascotresfile_dat based on the file contents.
 
static RESFILE_TYPE resfileType (const std::string &fileName)
 Return the results format of the file provided as an argument.
 
static bool staticGetPercolatorFileNames (const char *szFileName, const char *cacheDirectory, const char *percolatorFeatures, const char *additionalFeatures, const bool useRetentionTimes, std::vector< std::string > &filenames, std::vector< bool > &exists)
 Returns a list of the Percolator input and output files for the specified data file (deprecated).
 
static bool staticGetPercolatorFileNames (const char *szFileName, const char *cacheDirectory, const ms_mascotoptions &options, const char *additionalFeatures, const std::vector< std::string > &adapterParameters, std::vector< std::string > &filenames, std::vector< bool > &exists)
 Returns a list of the Percolator input and output files for the specified data file.
 
static bool willCreateCache (const char *szFileName, const ms_mascotoptions &opts, const char *applicationName, std::string &resfileCacheFileName, unsigned int &cacheStatus)
 Returns true if a cache file will be created when the ms_mascotresfile_dat constructor is called.
 
static bool willCreateCache (const char *szFileName, const unsigned int flags, const char *cacheDirectory, std::string *cacheFileName)
 Returns true if a cache file will be created when the ms_mascotresfile_dat constructor is called.
 

Protected Member Functions

virtual bool getCrosslinking (ms_crosslinking_configfile *crosslinkingFile) const =0
 
std::string getErrorInfoAsString (const int num) const
 
bool setErrorInfoFromString (const std::string &e)
 

Detailed Description

Abstract base class of ms_mascotresfile_dat and ms_mascotresfile_msr.

Until Mascot Server 2.8, there was only one Mascot results file format: plain text MIME format file with .dat extension. Mascot Server 3.0 introduced a new file format, Mascot Search Results (MSR), which is an SQLite database. The old .dat format is frozen and now referred to as dat28.

ms_mascotresfilebase has two concrete implementations: ms_mascotresfile_dat for the dat28 format and ms_mascotresfile_msr for the MSR format.

Use ms_mascotresfilebase::createResfile() to create an object using the right constructor. Almost all of the API is the same between the two classes, except for very low-level methods specific to the dat28 format.

Examples
peptide_list.cpp, repeat_search.cpp, resfile_error.cpp, resfile_info.cpp, resfile_input.cpp, resfile_params.cpp, and resfile_summary.cpp.

Member Enumeration Documentation

◆ FLAGS

enum FLAGS

Flags for opening the results file.

See Using enumerated values and static const ints in Perl, Java, Python and C# and Caching Mascot Results.

Enumerator
RESFILE_NOFLAG 

Dat28 format: Read the whole file into memory. MSR format: Use standard SQLite methods to read the file with low memory overhead.

RESFILE_USE_CACHE 

Dat28 format: Create the resfile cache if it doesn't already exist. Use the cache rather than reading the whole .dat file into memory. MSR format: this flag is ignored.

RESFILE_CACHE_IGNORE_ACC_DUPES 

When creating a cache file, don't check for duplicate accessions in the SEC_PROTEINS and SEC_DECOYPROTEINS sections. Skipping the check can save some time. It is strongly recommended that this flag is never used unless performance becomes a real issue and it is known that ms_mascotoptions::getIgnoreDupeAccession was not defined for the relevant database(s) when they were compressed.

RESFILE_USE_PARENT_PARAMS 

For use when Combining multiple results files. The flags and parameters are then inherited from the parent search.

RESFILE_CACHE_IGNORE_DATE_CHANGE 

Dat28 format: Opening the resfile cache CDB file should ignore the last modified timestamp on the .dat file. MSR format: this flag is ignored.
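The flag values are distinct bits, so they can be combined with bitwise OR and tested with bitwise AND. The following is a minimal sketch, assuming a stand-in copy of the enum so that it compiles without the Mascot Parser headers; cachedOpenFlags() and hasFlag() are hypothetical helpers, not part of the library.

```cpp
// Stand-in copy of the FLAGS values listed above. Real code would use
// matrix_science::ms_mascotresfilebase::FLAGS from the Mascot Parser headers.
enum FLAGS : unsigned int {
    RESFILE_NOFLAG                   = 0x00000000,
    RESFILE_USE_CACHE                = 0x00000001,
    RESFILE_CACHE_IGNORE_ACC_DUPES   = 0x00000002,
    RESFILE_USE_PARENT_PARAMS        = 0x00000004,
    RESFILE_CACHE_IGNORE_DATE_CHANGE = 0x00000008
};

// Hypothetical helper: flags for opening a dat28 file through the resfile
// cache while ignoring the last-modified timestamp of the .dat file.
unsigned int cachedOpenFlags() {
    return RESFILE_USE_CACHE | RESFILE_CACHE_IGNORE_DATE_CHANGE;
}

// Hypothetical helper: true if the bit for `flag` is set in `flags`.
bool hasFlag(unsigned int flags, FLAGS flag) {
    return (flags & flag) != 0;
}
```

The combined value would be passed as the flags argument when opening the file. Per the descriptions above, the two cache flags used here are ignored for MSR files.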

◆ KA_TASK

enum KA_TASK

Processing some results files is computationally intensive. These are the tasks that can be performed.

See Using enumerated values and static const ints in Perl, Java, Python and C#.

Used with getKeepAlive(), but also see outputKeepAlive()

Enumerator
KA_CREATEINDEX_CI 

Creating a cache file when Using the resfile cache (dat28 format only).

KA_READFILE_RF 

Reading the results file into memory when not using a cache.

KA_ASSIGNPROTEINS_AP 

Assigning peptides to proteins to get a list of all possible proteins.

KA_GROUPPROTEINS_GP 

Grouping proteins using ms_mascotresults::MSRES_GROUP_PROTEINS or ms_mascotresults::MSRES_CLUSTER_PROTEINS.

KA_UNASSIGNEDLIST_UL 

Creating the unassigned list - see ms_mascotresults::createUnassignedList.

KA_QUANTITATION 

Calculating quantitation values for reporter and multiplex protocols.

KA_CREATECACHE_CC 

Creating a cache file when Using the pepsum cache (MSR and dat28).

KA_THRESHFORFDR_FDR 

Calls to ms_mascotresults::getThresholdForFDRAboveHomology can be slow.

KA_LAST 

Placeholder that is equal to the number of possible tasks.
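Because KA_LAST equals the number of tasks, it can size a lookup table indexed by task. The sketch below assumes a stand-in copy of the enum and illustrative label strings; taskLabel() is a hypothetical helper for turning the task reported by getKeepAlive() into display text.

```cpp
#include <array>
#include <string>

// Stand-in copy of the KA_TASK values listed above.
enum KA_TASK {
    KA_CREATEINDEX_CI    = 0,
    KA_READFILE_RF       = 1,
    KA_ASSIGNPROTEINS_AP = 2,
    KA_GROUPPROTEINS_GP  = 3,
    KA_UNASSIGNEDLIST_UL = 4,
    KA_QUANTITATION      = 5,
    KA_CREATECACHE_CC    = 6,
    KA_THRESHFORFDR_FDR  = 7,
    KA_LAST              = 8
};

// Hypothetical helper: map a task to a short label. The label text is
// illustrative only and is not produced by the library.
std::string taskLabel(KA_TASK task) {
    static const std::array<const char*, KA_LAST> labels = {{
        "creating resfile cache",
        "reading results file",
        "assigning peptides to proteins",
        "grouping proteins",
        "creating unassigned list",
        "calculating quantitation",
        "creating pepsum cache",
        "finding threshold for FDR"
    }};
    const int idx = static_cast<int>(task);
    if (idx < 0 || idx >= static_cast<int>(KA_LAST)) {
        return "unknown task";
    }
    return labels[static_cast<std::size_t>(idx)];
}
```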

◆ PERCOLATOR_FILE_NAMES

enum PERCOLATOR_FILE_NAMES

Offsets into a vector of Percolator filenames.

See Using enumerated values and static const ints in Perl, Java, Python and C#.

Used with getPercolatorFileNames().

Enumerator
PERCOLATOR_INPUT_FILE 

This file is created by ms-createpip.exe and read by Percolator.

PERCOLATOR_OUTPUT_TARGET 

From the standard output (stdout) of percolator.exe.

PERCOLATOR_OUTPUT_DECOY 

Specified using the -B flag when calling Percolator.
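The enumerators are offsets into the vector returned by getPercolatorFileNames(). A minimal sketch of that indexing follows, assuming a stand-in copy of the enum; percolatorFile() is a hypothetical helper that also guards against a vector that is shorter than expected.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in copy of the PERCOLATOR_FILE_NAMES values listed above.
enum PERCOLATOR_FILE_NAMES {
    PERCOLATOR_INPUT_FILE    = 0,
    PERCOLATOR_OUTPUT_TARGET = 1,
    PERCOLATOR_OUTPUT_DECOY  = 2
};

// Hypothetical helper: return the requested filename, or an empty string
// if the vector has no entry at that offset.
std::string percolatorFile(const std::vector<std::string>& names,
                           PERCOLATOR_FILE_NAMES which) {
    const std::size_t idx = static_cast<std::size_t>(which);
    return idx < names.size() ? names[idx] : std::string();
}
```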

◆ RESFILE_TYPE

enum RESFILE_TYPE

Supported results file formats.

Enumerator
RESFILE_DAT28 

Plain text MIME-format results file produced by Mascot Server 2.0 to 2.8.

RESFILE_MSR 

Mascot Search Results (MSR) file introduced in Mascot Server 3.0.

RESFILE_UNKNOWN 

Unknown file format.

◆ XML_SCHEMA

enum XML_SCHEMA

The results file contains embedded files in XML format and these need to be validated against a schema.

This is the list of schemas that can be set using setXMLschemaFilePath() and retrieved using getXMLschemaFilePath().

Enumerator
XML_SCHEMA_QUANTITATION 

From the embedded quantitation file. Valid aliases are: "http://www.matrixscience.com/xmlns/schema/quantitation_1" and "http://www.matrixscience.com/xmlns/schema/quantitation_2".

XML_SCHEMA_UNIMOD 

From the embedded unimod file. Valid alias is: http://www.unimod.org/xmlns/schema/unimod_2.

XML_SCHEMA_DIRECTORY 

From the value of XMLschemaDirectory passed into the ms_mascotresfilebase constructor.

XML_SCHEMA_CROSSLINKING 

From the embedded crosslinking file. Valid alias is: http://www.matrixscience.com/xmlns/schema/crosslinking_1.

XML_SCHEMA_LAST 

Placeholder that is equal to the number of possible schemas.

Constructor & Destructor Documentation

◆ ms_mascotresfilebase()

ms_mascotresfilebase ( const char *  szFileName,
const int  keepAliveInterval = 0,
const char *  keepAliveText = "<!-- %d seconds -->\n",
const unsigned int  flags = RESFILE_NOFLAG,
const char *  cacheDirectory = 0,
const char *  XMLschemaDirectory = 0,
ms_progress_info *  progressMonitor = 0 
)

ms_mascotresfilebase is an abstract class; use createResfile().

See ms_mascotresfilebase::createResfile.

Parameters
szFileName  is the path to a valid Mascot results file.
keepAliveInterval  is the interval in seconds between each time the keepAliveText is output to stdout.
keepAliveText  is output every keepAliveInterval seconds while the file is being loaded. See outputKeepAlive() for further details.
flags  are created by bitwise ORing the ms_mascotresfile_dat::FLAGS.
cacheDirectory  is the location where any cache files are stored.
XMLschemaDirectory  is the location where the XML schema files are located.
progressMonitor  is an optional parameter that can be used to track progress of the creation of this object.
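
The default keepAliveText contains a %d placeholder. As a rough sketch, assuming the placeholder is replaced by an elapsed time in seconds (an assumption drawn from the default text "<!-- %d seconds -->\n", not from a documented contract), the emitted line could be formed as follows. formatKeepAlive() is a hypothetical helper; see outputKeepAlive() for the actual behaviour.

```cpp
#include <string>

// Hypothetical helper: substitute the first "%d" in keepAliveText with the
// elapsed time in seconds. The meaning of "%d" is assumed, not documented here.
std::string formatKeepAlive(const std::string& keepAliveText, int elapsedSeconds) {
    std::string out = keepAliveText;
    const std::string::size_type pos = out.find("%d");
    if (pos != std::string::npos) {
        out.replace(pos, 2, std::to_string(elapsedSeconds));
    }
    return out;
}
```

With the default text, the result is an HTML comment, which suggests the string is intended to keep a browser connection open without changing the rendered page.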

Member Function Documentation

◆ anyMSMS()

bool anyMSMS ( ) const
pure virtual

Returns true if any of the queries in the search contain ions data.

See also the isMSMS() member, although this function is the preferred one.

Returns
true if any queries contain MS-MS fragment masses

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ anyPMF()

bool anyPMF ( ) const
pure virtual

Returns true if any of the queries in the search just contain a single peptide mass.

See also the isPMF() member, although this function is the preferred one.

Returns
true if any queries contain no MS-MS values

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
repeat_search.cpp, and resfile_info.cpp.

◆ anySQ()

bool anySQ ( ) const
pure virtual

Returns true if any of the queries in the search contain seq or comp commands.

See also the isSQ() member, although this function is the preferred one.

Returns
true if any queries contain seq or comp commands

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ anyTag()

bool anyTag ( ) const
pure virtual

Returns true if any of the queries in the search contain tag or etag commands.

See also anySQ(), anyPMF() and anyMSMS()

Returns
true if any queries contain tag or etag commands

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ appendErrors()

void appendErrors ( const ms_errors &  src )
inherited

Copies all errors from another instance and appends them to the end of this object's list.

Parameters
src  The object to copy the errors from. See Maintaining object references: two rules of thumb.

◆ appendResfile()

int appendResfile ( const char *  filename,
int  flags = RESFILE_USE_PARENT_PARAMS,
const char *  cacheDirectory = 0 
)
pure virtual

Multiple results files can be summed together and treated as 'one'.

See Combining multiple results files.

Attempts to load the specified results file and append it to the existing file.

Any warning or error messages in the file are also appended to the existing object. If isValid() for the new file returns false, it is not appended and this function returns 0.

A merged report requires the results to be in the same file format. That is, a dat28 (.dat) can be appended to another dat28 file, and an MSR file (.msr) can be appended to another MSR file, but you cannot append an MSR file to a dat28 file or vice versa.

ms_errs::ERR_INVALID_RESFILE, ms_errs::ERR_READINGFILE or ms_errs::ERR_MSP_MSR_READING_FILE may be set if the file formats are not compatible.

ms_errs::ERR_CANNOT_APPEND_RESFILE_NO_FNAMES and ms_errs::ERR_CANNOT_APPEND_RESFILE will be set if the file cannot be appended because of different parameters, such as a different enzyme.

Parameters
filename  is the path to the results file to append.
flags  are one of the ms_mascotresfilebase::FLAGS. If RESFILE_USE_PARENT_PARAMS is specified, then the flags, keepAlive and cache directory are copied from the parent object.
cacheDirectory  is the directory for the cache files if RESFILE_USE_CACHE has been specified. If RESFILE_USE_PARENT_PARAMS is specified and cacheDirectory is null or an empty string, then the cache directory for the parent object is used.
Returns
0 on error, or the id number of the new file. The id can be used with getNumQueries(), getJobNumber(), getFileName() or getResfile(). The return value from the first successful call to this function will be 2.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.
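
The return-value contract above can be illustrated with a toy model. This is a sketch of the documented numbering only, not the library implementation: ResfileSetModel is a hypothetical class, and it assumes that ids continue to increase by one after the first append (the text only states that the first successful call returns 2).

```cpp
#include <vector>

// Stand-in copy of the RESFILE_TYPE values from this page.
enum RESFILE_TYPE {
    RESFILE_DAT28   = 0,
    RESFILE_MSR     = 1,
    RESFILE_UNKNOWN = 2
};

// Toy model: the parent file has id 1, and each successful append gets the
// next id. Appending fails (returns 0) if the new file is invalid or its
// format differs from the parent's format.
class ResfileSetModel {
public:
    explicit ResfileSetModel(RESFILE_TYPE parentType) : types_(1, parentType) {}

    int append(RESFILE_TYPE type, bool fileIsValid) {
        if (!fileIsValid || type != types_.front()) {
            return 0;  // nothing is appended on error
        }
        types_.push_back(type);
        return static_cast<int>(types_.size());
    }

    int numberOfResfiles() const { return static_cast<int>(types_.size()); }

private:
    std::vector<RESFILE_TYPE> types_;
};
```

In real code, a return value of 0 from appendResfile() should be followed by a check of the error list, for example with getLastErrorString().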

◆ clearAllErrors()

void clearAllErrors ( )
inherited

Remove all errors from the current list of errors.

The list of 'errors' can include fatal errors, warning messages, information messages and different levels of debugging messages.

All messages are accumulated into a list in this object, until clearAllErrors() is called.

See Error Handling.

See also
isValid(), getLastError(), getLastErrorString(), getErrorHandler()
Examples
common_error.cpp, resfile_error.cpp, and resfile_summary.cpp.

◆ copyFrom()

void copyFrom ( const ms_errors *  right )
inherited

Use this member to make a copy of another instance.

Parameters
right  is the source to initialise from.

◆ createResfile()

std::unique_ptr< ms_mascotresfilebase > createResfile ( const char *  szFileName,
const int  keepAliveInterval = 0,
const char *  keepAliveText = "<!-- %d seconds -->\n",
const unsigned int  flags = matrix_science::ms_mascotresfilebase::RESFILE_NOFLAG,
const char *  cacheDirectory = "../data/cache/%Y/%m",
const char *  XMLschemaDirectory = 0,
matrix_science::ms_progress_info *  progressMonitor = 0 
)
static

Return a new ms_mascotresfile_msr or ms_mascotresfile_dat based on the file contents.

The function 'sniffs' and detects the file contents using ms_mascotresfilebase::resfileType. If it looks like an MSR file, the function returns a new ms_mascotresfile_msr. If it looks like a dat28 (.dat) file, the function returns ms_mascotresfile_dat. Otherwise, a 'nil' object is returned which contains no data and is invalid.

The arguments are the same between the classes, but the details differ a bit. For example, ms_mascotresfile_msr doesn't need caching for fast random access, so it ignores the flags parameter. However, it's always safe to use the same flags regardless of class.

Please see the detailed class documentation for ms_mascotresfile_dat and ms_mascotresfile_msr.

Parameters
szFileName  is the path to a valid Mascot results file.
keepAliveInterval  is the interval in seconds between each time the keepAliveText is output to stdout.
keepAliveText  is output every keepAliveInterval seconds while the file is being loaded. See outputKeepAlive() for further details.
flags  are created by bitwise ORing the ms_mascotresfile_dat::FLAGS.
cacheDirectory  is the location where any cache files are stored.
XMLschemaDirectory  is the location where the XML schema files are located.
progressMonitor  is an optional parameter that can be used to track progress of the creation of this object.
Returns
ms_mascotresfile_msr or ms_mascotresfile_dat object, or a 'nil' object if the file type isn't supported.
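
To make the idea of 'sniffing' concrete, here is a minimal sketch that classifies a file from its first bytes. It is not the implementation of resfileType(). It relies on the fact that SQLite database files begin with the 16-byte string "SQLite format 3" followed by a NUL byte, and it assumes, for illustration only, that a dat28 file can be recognised by a leading "MIME-Version:" header. sniffResfileType() is a hypothetical function.

```cpp
#include <string>

// Stand-in copy of the RESFILE_TYPE values from this page.
enum RESFILE_TYPE {
    RESFILE_DAT28   = 0,
    RESFILE_MSR     = 1,
    RESFILE_UNKNOWN = 2
};

// Hypothetical sniffing function: `head` holds the first bytes of the file.
RESFILE_TYPE sniffResfileType(const std::string& head) {
    // SQLite header string, including its terminating NUL (16 bytes).
    static const std::string sqliteMagic("SQLite format 3\0", 16);
    // Assumed marker for a plain text MIME-format dat28 file.
    static const std::string mimePrefix("MIME-Version:");

    if (head.compare(0, sqliteMagic.size(), sqliteMagic) == 0) {
        return RESFILE_MSR;
    }
    if (head.compare(0, mimePrefix.size(), mimePrefix) == 0) {
        return RESFILE_DAT28;
    }
    return RESFILE_UNKNOWN;
}
```

A caller would read the first few bytes of the file in binary mode and pass them in. With the real API, the object returned by createResfile() should be checked with isValid() before use.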

◆ get_ms_mascotresults_params() [1/2]

std::string get_ms_mascotresults_params ( const ms_mascotoptions &  opts,
ms_mascotresults_params &  resParams 
) const

Return default flags and parameters for creating an ms_peptidesummary or ms_proteinsummary object.

A number of optional flags and parameters can be passed to the ms_proteinsummary or ms_peptidesummary constructors. For an application or script running on the Mascot server, the default values for some of these parameters should normally be taken from the mascot.dat file. This function sets the values and flags required to pass to the constructor in the passed ms_mascotresults_params object.

Parameters
[in]  opts  contains the options stored in mascot.dat. Construct an ms_datfile and then call ms_datfile::getMascotOptions() to obtain this value.
[out]  resParams  receives the values and flags required to pass to the peptide or protein summary object, overwriting any values which were already set.
Returns
If the search contains MS-MS data and the number of queries >= ms_mascotoptions::getProteinFamilySwitch, then the returned script name will be ms_mascotoptions::getResultsPerlScript_2, otherwise it will be ms_mascotoptions::getResultsPerlScript.
Examples
resfile_summary.cpp.

◆ get_ms_mascotresults_params() [2/2]

std::string get_ms_mascotresults_params ( const ms_mascotoptions &  opts,
unsigned int *  gpFlags,
double *  gpMinProbability,
int *  gpMaxHitsToReport,
double *  gpIgnoreIonsScoreBelow,
unsigned int *  gpMinPepLenInPepSummary,
bool *  gpUsePeptideSummary,
unsigned int *  gpFlags2 
) const

[Deprecated] Return default flags and parameters for creating an ms_peptidesummary or ms_proteinsummary object.

Deprecated:
This function is deprecated. All new code should use ms_mascotresfilebase::get_ms_mascotresults_params(const ms_mascotoptions &, ms_mascotresults_params &) instead, as all the parameters are encapsulated in the ms_mascotresults_params class.

A number of optional flags and parameters can be passed to the ms_proteinsummary or ms_peptidesummary constructors. For an application or script running on the Mascot server, the default values for some of these parameters should normally be taken from the mascot.dat file. This function returns the values and flags required to pass to the constructor.

See Multiple return values in Perl, Java, Python and C#.

Parameters
[in]  opts  contains the options stored in mascot.dat. Construct an ms_datfile and then call ms_datfile::getMascotOptions() to obtain this value.
[out]  gpFlags  will return the flags that are to be passed as the second parameter to the ms_proteinsummary or ms_peptidesummary object.
[out]  gpMinProbability  is the third parameter to be passed to the ms_proteinsummary or ms_peptidesummary objects. This return value will normally be equal to the value returned from ms_mascotoptions::getSigThreshold().
[out]  gpMaxHitsToReport  this return value will normally be the one returned by ms_searchparams::getREPORT().
[out]  gpIgnoreIonsScoreBelow  this return value will be the one returned by ms_mascotoptions::getIgnoreIonsScoreBelow().
[out]  gpMinPepLenInPepSummary  this return value will be the one returned by ms_mascotoptions::getMinPepLenInPepSummary.
[out]  gpUsePeptideSummary  will be false if the file doesn't contain any MS-MS data (as returned by the anyMSMS() function) or any sequence tags (anyTag()). In this case, you should create an ms_proteinsummary. If gpUsePeptideSummary is true, you should create an ms_peptidesummary object.
[out]  gpFlags2  is only required for an ms_peptidesummary. If gpUsePeptideSummary is true, gpFlags2 will have the following bits set.
Returns
If the search contains MS-MS data and the number of queries >= ms_mascotoptions::getProteinFamilySwitch, then the returned script name will be ms_mascotoptions::getResultsPerlScript_2, otherwise it will be ms_mascotoptions::getResultsPerlScript.

◆ getCacheDirectory()

std::string getCacheDirectory ( bool  processed = true) const
pure virtual

Returns the directory being used for cache files (if any).

The cacheDirectory supplied to the constructor ms_mascotresfilebase::ms_mascotresfilebase may contain a number of '%' flags which get substituted by Mascot Parser.

This function returns either an absolute directory or a directory relative to the current working directory, depending on what was supplied and the parameter processed.

See Caching Mascot Results and ms_mascotoptions::getCacheDirectory

Parameters
processed  if true (the default), then the returned directory is relative to the current directory and will have any '%' flags replaced with the relevant directory. If processed is false, then the directory returned will be identical to the one passed to the constructor.
Returns
The cache directory.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getCrosslinking()

bool getCrosslinking ( ms_crosslinking_configfile crosslinkingFile) const
protectedpure virtual

The return value indicates that the embedded crosslinking method file exists in the results file. Call ms_crosslinking_configfile::isValid to determine whether the XML part has been parsed successfully.

The contents of the file are validated against a schema by default.

See Object initialising functions in Perl, Java, Python and C#.

See also
getCrosslinkingMethod() which is generally easier to use.
Parameters
crosslinkingFile  a pointer to a crosslinking file object. This must be a valid pointer to a valid object, which should normally be created using the default constructor: ms_crosslinking_configfile::ms_crosslinking_configfile
Returns
true if embedded file exists and false if it doesn't

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getCrosslinkingMethod()

bool getCrosslinkingMethod ( ms_crosslinking_method method) const

Return the crosslinking method object from the embedded crosslinking file.

This method returns true if all of the following apply:

  • the embedded crosslinking file exists;
  • a valid schema file location has been set using the XMLschemaDirectory parameter in the constructor or the file exists in the default location;
  • the embedded crosslinking file validates against the schema;
  • the crosslinking method named in the CROSSLINKING parameter exists;
  • the crosslinking method can be loaded.

Otherwise the method returns false.

If the CROSSLINKING parameter is empty or equals "none", then the method simply returns false. Otherwise, on failure, the method sets the warning ms_errors::ERR_MSP_CROSSLINKING_FAILEDLOAD.

Parameters
method  A pointer to a crosslinking method object. This must be a valid pointer to a valid object, which should normally be created using the default constructor ms_crosslinking_method::ms_crosslinking_method.
Returns
True if the method was loaded; false otherwise.

◆ getDatabaseType()

DATABASE_TYPE getDatabaseType ( const int  idx) const
pure virtual

Return database type if available.

In dat28 format, Mascot 2.6 and later save the type of the searched database(s) in the results file, as db_typeX= lines in the header section. These types are AA (amino acid), NA (nucleic acid) or SL (spectral library).

In MSR format, introduced in Mascot Server 3.0, the 'db_type' column is always present in the search__databases table.

The number of databases is ms_searchparams::getNumberOfDatabases(), so idx should be between 1 and getNumberOfDatabases().

Spectral libraries must have a reference database. If the reference database is not part of the actual search, protein accessions mapped to it have a database number above ms_searchparams::getNumberOfDatabases(). For example, if the search contains one AA database and one spectral library, getNumberOfDatabases() is 2 and the types returned by getDatabaseType() are AA (idx = 1) and SL (idx = 2). The reference database is at index 3 with type SLREF.

To find the number of the reference database of a spectral library, see ms_mascotresfilebase::getReferenceDatabaseNumberOfSL().

Parameters
idx  index of the database; must normally be between 1 and ms_searchparams::getNumberOfDatabases(), or a valid database number returned by ms_mascotresfilebase::getReferenceDatabaseNumberOfSL()
Returns
database type if available. If idx is out of range or isDatabaseTypeAvailable() is false, returns AA.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getDate()

virtual int getDate ( ) const
pure virtual

Returns the date and time of the search in seconds since midnight January 1st 1970.

In dat28 format, obtained from the date= line in the header section of the file.

In MSR format, obtained from the 'date' row in the search__header table.

Can be converted to day, month, year etc. using gmtime or similar functions.
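As a minimal sketch of that conversion, the helper below formats an epoch value as a UTC timestamp using only the C++ standard library. The name formatSearchDate is hypothetical and not part of Mascot Parser; in real code the argument would be the value returned by getDate().

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Formats a search date (seconds since midnight, January 1st 1970, UTC)
// as "YYYY-MM-DD HH:MM:SS". Hypothetical helper, not part of Mascot Parser.
std::string formatSearchDate(int secondsSinceEpoch)
{
    const std::time_t t = static_cast<std::time_t>(secondsSinceEpoch);

    // std::gmtime returns a pointer to shared static storage, so the
    // result is consumed immediately.
    const std::tm *utc = std::gmtime(&t);
    if (utc == nullptr)
        return std::string();

    char buf[32];
    const std::size_t n = std::strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", utc);
    return std::string(buf, n);
}
```

Because std::gmtime uses shared static storage, a multithreaded application may prefer a reentrant alternative where one is available.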

Returns
the date and time of the search in seconds since midnight January 1st 1970.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ getDecoyTypeForDB()

ms_mascotoptions::DECOY_ALGORITHM getDecoyTypeForDB ( const int  idx = 1) const
pure virtual

Returns the decoy algorithm type for a given database.

In dat28 format, the decoy algorithm type is saved as decoy_type= or decoy_typeX= in the header section, depending on Mascot version.

In MSR format, the algorithm type is saved as the 'decoy_type' column in the search__databases table.

If idx = 1, the method returns the value of decoy_type=. If idx > 1, the method returns the corresponding decoy_typeX= line, or if one doesn't exist, falls back on decoy_type=.

If there is no suitable value in the file or idx is outside its range, the method returns ms_mascotoptions::DECOY_ALGORITHM_NONE.
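The fallback order described above for dat28 files can be sketched as follows. This is an illustration only, not the Parser implementation: the Header type, the lookupDecoyType name and the use of an empty string in place of ms_mascotoptions::DECOY_ALGORITHM_NONE are all assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the key=value lines of a dat28 header section.
using Header = std::map<std::string, std::string>;

// Returns the raw decoy type value for database idx, or an empty string
// (standing in for DECOY_ALGORITHM_NONE) when nothing suitable is found.
std::string lookupDecoyType(const Header &header, int idx, int numDatabases)
{
    if (idx < 1 || idx > numDatabases)
        return std::string();                      // idx is outside its range

    if (idx > 1) {
        // Prefer the per-database decoy_typeX= line when it exists.
        const auto perDb = header.find("decoy_type" + std::to_string(idx));
        if (perDb != header.end())
            return perDb->second;
    }

    // idx == 1, or no decoy_typeX= line: fall back on decoy_type=.
    const auto general = header.find("decoy_type");
    return general != header.end() ? general->second : std::string();
}
```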

Parameters
idx  index of the database from 1 to ms_searchparams::getNumberOfDatabases().
Returns
the decoy algorithm type, or ms_mascotoptions::DECOY_ALGORITHM_NONE.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getEnzyme()

bool getEnzyme ( ms_enzymefile efile,
const char *  enzymeFileName = 0 
) const
pure virtual

Returns an object that represents the embedded enzyme file as a reduced enzymes file.

In dat28 format, for data files created with Mascot 2.2 and later, the full definition of the enzyme used is included in the Mascot results file. For earlier versions of Mascot, just the name is recorded. This function attempts to read the definition from the results file. If the definition is not present in the results file and a path to the enzymes file has been passed, then this function reads the enzymes file and removes all entries from the list in memory apart from the one with the name specified in the results file.

In MSR format, the full definition of the enzyme used in the search is included in the Mascot results file.

To determine whether the content has been parsed successfully call ms_enzymefile::isValid.

See Object initialising functions in Perl, Java, Python and C#.

Parameters
efile  a pointer to an enzymes file object that will accept the content from the embedded file or the external enzymes file if necessary. If successful, the enzyme itself can be retrieved by passing an index of zero to ms_enzymefile::getEnzymeByNumber().
enzymeFileName  is only used for results files prior to Mascot 2.2.
Returns
True if the embedded file exists in the results file or if the enzyme specified in the results file can be loaded from the external enzymes file.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getErrorHandler()

const ms_errs * getErrorHandler ( ) const
inherited

Retrieve the error object using this function to get access to all errors and error parameters.

See Error Handling.

Returns
Constant pointer to the error handler
See also
isValid(), getLastError(), getLastErrorString(), clearAllErrors(), getErrorHandler()
Examples
common_error.cpp, and http_helper_getstring.cpp.

◆ getErrorInfoAsString()

std::string getErrorInfoAsString ( const int  num) const
protected

For saving any errors in the .cdb file

Parameters
num  1..getNumberOfErrors()
Returns
"[Error Number]:[Number of times]:[Error string]"

◆ getErrorNumber()

int getErrorNumber ( const int  num = -1) const

Return a specific error number - or ms_errs::ERR_NO_ERROR.

All errors are accumulated into a list in this object, until clearAllErrors() is called.

Errors in other classes are accumulated here. If, for example, there is an error when creating a peptide summary, the errors need to be accessed through this class.

See Error Handling.

See also
getNumberOfErrors(), clearAllErrors(), getErrorNumber(), getErrorString()

In Mascot Parser 2.5 and later, this is implemented by calling: ms_errs::getErrorNumber()

Parameters
num  is the error number in the range 1..getNumberOfErrors(). Passing a value of -1 will return the last error, or ERR_NO_ERROR. If an invalid number is passed, ERR_NO_ERROR will be returned (and no error will be added to the list of errors!).
Returns
the error number
Examples
resfile_error.cpp, and resfile_summary.cpp.

◆ getErrorString()

std::string getErrorString ( const int  num) const

Return a specific error as a string.

All errors are accumulated into a list in this object, until clearAllErrors() is called. To return a particular error, call this function with a number 1..getNumberOfErrors(). Passing a value of -1 will return the last error, or an empty string. If an invalid number is passed an empty string will be returned (and no error will be added to the list of errors!).
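The indexing rules above (1-based access, -1 for the last error, a neutral result for an invalid index) can be modelled with a small stand-alone class. This is a sketch under stated assumptions, not the real ms_errs class: ErrorList, ErrorEntry, add() and the value chosen for kNoError are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for ERR_NO_ERROR; the real numeric value is an assumption here.
const int kNoError = 0;

// Hypothetical miniature of the accumulated error list.
struct ErrorEntry { int number; std::string text; };

class ErrorList {
public:
    void add(int number, const std::string &text) { errors_.push_back({number, text}); }
    void clearAllErrors() { errors_.clear(); }
    int getNumberOfErrors() const { return static_cast<int>(errors_.size()); }

    // num is 1..getNumberOfErrors(), or -1 for the last error.
    int getErrorNumber(int num = -1) const {
        const ErrorEntry *e = find(num);
        return e ? e->number : kNoError;
    }
    std::string getErrorString(int num) const {
        const ErrorEntry *e = find(num);
        return e ? e->text : std::string();
    }

private:
    // Returns nullptr for an invalid index; the lookup never adds an error.
    const ErrorEntry *find(int num) const {
        if (num == -1)
            return errors_.empty() ? nullptr : &errors_.back();
        if (num < 1 || num > getNumberOfErrors())
            return nullptr;
        return &errors_[num - 1];
    }
    std::vector<ErrorEntry> errors_;
};
```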

Errors in other classes are accumulated here. If, for example, there is an error when creating a peptide summary, the errors need to be accessed through this class.

In Mascot Parser 2.5 and later, this is implemented by calling ms_errs::getErrorString but functionality is identical to previous versions.

See Error Handling.

See also
getNumberOfErrors(), clearAllErrors(), getErrorNumber(), getErrorString()
Parameters
num  1 to number of errors, or -1
Returns
error string for the given number, or the last error for -1, empty string if none
Examples
resfile_error.cpp, and resfile_summary.cpp.

◆ getExecTime()

virtual int getExecTime ( ) const
pure virtual

Returns the time taken for the search.

In dat28 format, obtained from the exec_time= line in the header section.

In MSR format, obtained from the 'exec_time' row in the search__header table.

This is the 'wall clock' time, not the CPU time.

Returns
execution time in seconds

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ getFastaPath()

std::string getFastaPath ( int  idx = 1) const
pure virtual

Returns the path to the FASTA file used.

Available in Mascot 2.2 and later.

Mascot 2.6 and later support spectral libraries. Each spectral library must have a reference database into which found peptide sequences are mapped at the end of the search. The "effective" reference database could be one of the protein sequence databases searched, or it could be a separate database used only for lookup purposes. You can find the database number of the reference database with ms_mascotresfilebase::getReferenceDatabaseNumberOfSL(). If this is larger than getNumberOfDatabases(), the FASTA file path is obtained from the library_reference_fastafile line in the results file header.

Parameters
idx  index of the database from 1 to ms_searchparams::getNumberOfDatabases() for databases or libraries searched, or an index returned by ms_mascotresfilebase::getReferenceDatabaseNumberOfSL().
Returns
The full path to the database or library file used in the search.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getFastaVer()

std::string getFastaVer ( int  idx = 1) const
pure virtual

Returns the FASTA file version.

Mascot 2.6 and later support spectral libraries. Each spectral library must have a reference database into which found peptide sequences are mapped at the end of the search. The "effective" reference database could be one of the protein sequence databases searched, or it could be a separate database used only for lookup purposes. You can find the database number of the reference database with ms_mascotresfilebase::getReferenceDatabaseNumberOfSL(). If this is larger than getNumberOfDatabases(), the FASTA file version is obtained from the library_reference_release line in the results file header.

Parameters
idx  index of the database from 1 to ms_searchparams::getNumberOfDatabases() for databases or libraries searched, or an index returned by ms_mascotresfilebase::getReferenceDatabaseNumberOfSL().
Returns
File name of the database or library searched.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ getFileName()

std::string getFileName ( const int  id = 1) const
pure virtual

Returns the name of the results file passed into the constructor.

Parameters
id  a 1-based index. Unless appendResfile() has been called, this value must be '1'.
Returns
the file name (or file path) of the results file.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
repeat_search.cpp.

◆ getFirstPassThreshold()

double getFirstPassThreshold ( ) const
pure virtual

Return the threshold value for the first pass of an automated error tolerant search.

This is the significance threshold used at the end of the first pass of an automated error tolerant search to select proteins for the second pass.

In Mascot Server 2.7 and earlier, the first pass threshold was always 0.05.

In Mascot Server 2.8 and later, you can specify a target FDR in the search form. If the search is an error tolerant target-decoy search with target FDR, the first pass threshold can differ from 0.05.

See Score thresholds and score filtering (Mascot Server 2.8 and later).

Returns
First pass threshold, or 0.0 if this is not an integrated error tolerant search.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getHeaderKeyValues()

void getHeaderKeyValues ( std::vector< std::string > &  keys,
std::vector< std::string > &  values 
) const
pure virtual

Return all the header key-value pairs.

Get all the key-value pairs of the results header. The header contains data such as: number of queries, exec_time (search duration), database types (AA, NA, SL) and task ID.

To get the value of an individual header key, use ms_mascotresfilebase::getHeaderValue().

Parameters
[out] keys  A vector of non-empty keys.
[out] values  A vector of strings (some may be empty) in the same order as keys.
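Since the keys and values come back as two parallel vectors, client code often wants them as a single lookup table. The following is a minimal sketch of that step; zipKeyValues is a hypothetical helper, and it assumes the two vectors have equal length, which the "same order as keys" wording suggests but does not state outright.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Pairs up parallel key and value vectors, such as those filled in by
// getHeaderKeyValues(), into one map. Hypothetical helper.
std::map<std::string, std::string> zipKeyValues(const std::vector<std::string> &keys,
                                                const std::vector<std::string> &values)
{
    assert(keys.size() == values.size());   // assumption: one value per key

    std::map<std::string, std::string> out;
    for (std::size_t i = 0; i < keys.size(); ++i)
        out[keys[i]] = values[i];           // empty values are kept as-is
    return out;
}
```

For a single key, getHeaderValue() avoids building the table at all.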

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getHeaderValue()

std::string getHeaderValue ( const std::string &  key) const
pure virtual

Return the header value for the given key.

Get the value associated with the input key in the results header. To get all the values, use ms_mascotresfilebase::getHeaderKeyValues().

Parameters
[in] key  A non-empty string.
Returns
the value associated with the key, or empty if the key does not exist.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getInputQuery()

ms_inputquery getInputQuery ( const int  queryNum) const
pure virtual

Return the ms_inputquery object for the query given as argument.

This method call is equivalent to creating an ms_inputquery object with the current results file and queryNum as parameter.

Parameters
queryNum  is the query number
Returns
ms_inputquery object

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getJobNumber()

int getJobNumber ( const int  resfileID = 1) const
pure virtual

Return the job number for this file - obtained from the file name.

The library can only 'guess' at this since the value is not in the results file. To perform this function, it retrieves the job number from the file name, so be warned about changing file names. The function returns 0 if it cannot determine the job number.

Parameters
resfileID  is the 1-based ID of the results file. If multiple files have been merged together with appendResfile(), use the file ID returned by appendResfile() or getSrcFileIdForMultiFile() to access the job number of the appended files.
Returns
the derived job number.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getKeepAlive()

void getKeepAlive ( KA_TASK kaTask,
int &  kaPercentage,
std::string &  kaAccession,
int &  kaHit,
int &  kaQuery,
std::string &  kaText 
) const

Return the progress indicators used by the keepAlive functions.

See Multiple return values in Perl, Java, Python and C#, although there may be issues with some languages and the kaTask parameter.

It is normally easier for client applications to call ms_mascotresults::getCreateSummaryProgress() or outputKeepAlive() than to call this function.

Parameters
kaTask  is the task currently being performed by Mascot Parser. If Parser is doing nothing, then this will be the last task that was completed and kaPercentage will be 100.
kaPercentage  is the percentage (0..100) complete for the current kaTask.
kaAccession  is the current 'accession' being processed. See outputKeepAlive() for details of which tasks set this value.
kaHit  is the current hit being processed. See outputKeepAlive() for details of which tasks set this value.
kaQuery  is the current 'query' being processed. See outputKeepAlive() for details of which tasks set this value.
kaText  is the text that would be output by outputKeepAlive().

◆ getLastError()

int getLastError ( ) const

Return the last error number - or ms_errs::ERR_NO_ERROR.

Same as calling getErrorNumber() with -1 as a parameter.

Returns
error number
Examples
peptide_list.cpp, resfile_error.cpp, and resfile_summary.cpp.

◆ getLastErrorString()

std::string getLastErrorString ( ) const

Return the last error string - or an empty string.

Same as calling getErrorString() with -1 as a parameter.

Returns
last error string
Examples
peptide_list.cpp, repeat_search.cpp, resfile_error.cpp, resfile_info.cpp, resfile_input.cpp, and resfile_params.cpp.

◆ getLibraryMods()

int getLibraryMods ( std::vector< std::string > &  modNames,
std::vector< double > &  modDeltas 
) const
pure virtual

Return all the library mod names and deltas.

Get a list of all library modifications. If multiple spectral libraries were searched, this is the combined list of modifications.

Parameters
[out] modNames  Library modification names.
[out] modDeltas  Library modification deltas.
Returns
the number of modifications, or 0 if there are none.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getMascotVer()

virtual std::string getMascotVer ( ) const
pure virtual

Returns the version of Mascot used to perform the search.

In dat28 format, obtained from the version= entry in the header section of the file.

In MSR format, obtained from the 'version' row in the search__header table.

See also
versionGreaterOrEqual
Returns
the version of Mascot used to perform the search.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ getMasses()

bool getMasses ( ms_masses masses) const

Returns an ms_masses object from the mass values in the results file.

For results files from Mascot 2.2 and later, this function simply calls

    if (getUnimod(&umodConfigFile) && umodConfigFile.isValid())
    {
        masses->copyFrom(&umodConfigFile);
    }

For earlier results files in dat28 format, it reads the residue mass values from the masses section of the file. Since this section either contains average or monoisotopic masses (but not both), the resulting ms_masses object will only have one set of masses.

See Object initialising functions in Perl, Java, Python and C#.

Parameters
masses  a pointer to a valid masses object that will accept the content from the section.
Returns
true always.

◆ getMassesKeyValues()

void getMassesKeyValues ( std::vector< std::string > &  keys,
std::vector< double > &  values 
) const
pure virtual

Return all the residue and modification masses as key-value pairs.

Get all the modification and residue masses as key-value pairs. This includes:

  • Residue masses (A, B, C, D, etc.)
  • Terminus masses (N-term, C-term)
  • Element and atomic masses (H, O, electron, etc.)
  • Variable modification deltas (delta1, delta2, etc.)
  • Fixed modification deltas (FixedMod1, FixedMod2, etc.)
  • Fragment neutral loss (NeutralLoss1, NeutralLoss2, etc.)
  • Peptide neutral loss (PepNeutralLoss1, PepNeutralLoss2, etc.)
  • Required peptide neutral loss (ReqPepNeutralLoss1, ReqPepNeutralLoss2, etc.)
  • Ignore masses (Ignore1, Ignore2, etc.)

To get the value of an individual key, use ms_mascotresfilebase::getMassValue().

In general, it is easier to access these values through ms_searchparams.

Parameters
[out] keys  A vector of non-empty keys.
[out] values  A vector of doubles in the same order as keys.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getMassValue()

std::string getMassValue ( const std::string &  key) const
pure virtual

Return the residue or modification mass for the given key.

Get the value associated with the input key in the list of masses. To get all the values, use ms_mascotresfilebase::getMassesKeyValues().

Parameters
[in] key  A non-empty string.
Returns
the value associated with the key, or 0.0 if the key does not exist.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getMonoLinkModification()

const ms_modification * getMonoLinkModification ( const int  modNum,
const int  monoLink 
) const

Returns an ms_modification object that represents a monolink variable modification.

The method performs the following steps:

The ms_modification object contains the following fields:

Parameters
modNum  Variable mod index in range 1..32. Usually this comes from the peptide's variable mods string.
monoLink  Index of the neutral loss element. Usually this comes from the peptide's monolink string.
Returns
a pointer, or null if this isn't a linker or the monolink index is invalid.

◆ getMSParserVersion()

std::string getMSParserVersion ( ) const

Returns the version number of the Mascot Parser library.

This version information is also available:

  • In Windows from the Mascot Parser DLL in the version 'tab'
  • From Perl using perl -Mmsparser -e "print msparser->VERSION()"
  • From Java using:
        JarFile jar = new JarFile(new File(jarName));
        Manifest jarManifest = jar.getManifest();
        Attributes mainAttributes = jarManifest.getMainAttributes();
        String version = (String) mainAttributes.get(Attributes.Name.IMPLEMENTATION_VERSION);
    
Returns
the version number as a string.

◆ getMultiFileQueryNumber()

int getMultiFileQueryNumber ( const int  localQuery,
const int  fileId 
) const
pure virtual

Return the multi-file query number from the local query number in an appended file.

See Multiple return values in Perl, Java, Python and C#.

Needs to be called on the 'primary' file object rather than on a ms_mascotresfile_dat or ms_mascotresfile_msr object returned by the getResfile() function.

See Combining multiple results files.

Example: Assume that the primary results file has 10 queries, file 2 has 20 queries and file 3 has 30 queries. This function will return the following:

localQuery  fileId  returned query
6           1       6
11          2       21
1           3       31
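The numbers in this example are consistent with a simple offset scheme, in which the queries of each appended file are numbered after those of all preceding files. The sketch below reproduces the example on that assumption; toMultiFileQuery and its queriesPerFile argument are hypothetical and say nothing about how Parser stores the mapping internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// queriesPerFile[i] holds the number of queries in the file whose 1-based
// ID is i + 1. Returns the multi-file query number, or -1 for an invalid
// localQuery / fileId combination. Hypothetical helper.
int toMultiFileQuery(const std::vector<int> &queriesPerFile, int localQuery, int fileId)
{
    if (fileId < 1 || static_cast<std::size_t>(fileId) > queriesPerFile.size())
        return -1;
    if (localQuery < 1 || localQuery > queriesPerFile[fileId - 1])
        return -1;

    // Skip over the queries of every file that precedes fileId.
    int offset = 0;
    for (int i = 0; i < fileId - 1; ++i)
        offset += queriesPerFile[i];
    return offset + localQuery;
}
```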
See also
getSrcQueryAndFileIdForMultiFile() for the 'inverse' function
Parameters
localQuery  is the query number which should be a value between 1 and getNumQueries() for the file specified by fileId.
fileId  is a 1-based index to the source file.
Returns
the query number that should be used for the primary file, or -1 if an invalid localQuery / fileId parameter combination is passed.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getNumberOfErrors()

int getNumberOfErrors ( ) const

Return the number of errors since the last call to clearAllErrors.

This will be zero if there has been no error.

All errors are accumulated into a list in this object, until clearAllErrors() is called.

Errors in other classes are accumulated here. If, for example, there is an error when creating a peptide summary, the errors need to be accessed through this class.

From version 2.5, implemented by calling getErrorHandler()->getNumberOfErrors()

See Error Handling.

See also
getNumberOfErrors(), clearAllErrors(), getErrorNumber(), getErrorString()
Returns
number of errors
Examples
resfile_error.cpp, and resfile_summary.cpp.

◆ getNumberOfResfiles()

int getNumberOfResfiles ( ) const
pure virtual

Multiple results files can be summed together and treated as 'one'.

See Combining multiple results files.

Thread safe
This method is safe to use from multiple threads. See also Using Parser in multithreaded applications.
Returns
the number of results files that have been combined together. If appendResfile() has not been called then this function will return a value of 1.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getNumEtSeqsSearched()

int getNumEtSeqsSearched ( const int  idx = 0) const
pure virtual

Returns the number of sequences searched in the second pass of an integrated error tolerant search.

In dat28 format, the value is obtained from the et_sequences= or et_sequencesX= line in the header section of the file.

In MSR format, the value is obtained from the 'et_sequences' column in the search__databases table.

See also
getNumSeqs(), getNumSeqsAfterTax(), getNumResidues()

See Integrated error tolerant search. This function will return -1 for the Original error tolerant search and for searches prior to Mascot 2.4.1.

Parameters
idx  index of the database. Specifying a value of 0 (the default) will return the sum of all sequences after taxonomy in all FASTA files searched. Otherwise, the value should be in the range from 1 to ms_searchparams::getNumberOfDatabases(). Specifying a value outside the range of 0..ms_searchparams::getNumberOfDatabases() will result in -1 being returned. If the database at index idx is a spectral library, the value returned is -1.
Returns
the number of sequences that were searched in the second pass of an Integrated error tolerant search

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getNumLibraryEntries()

int getNumLibraryEntries ( const int  idx = 0) const
pure virtual

Returns the number of entries in the spectral library searched.

In dat28 format, the value is obtained from the library_entriesX= line in the header section of the file. Spectral library support was added in Mascot 2.6.

In MSR format, the value is obtained by summing up the values of the 'entries' column in the search__spectral_libraries table.

Parameters
idx  index of the database. Specifying a value of 0 (the default) will return the sum of all library entries searched. Otherwise, the value should be in the range from 1 to ms_searchparams::getNumberOfDatabases(). Specifying a value outside the range of 0..ms_searchparams::getNumberOfDatabases() will result in -1 being returned. If the database at index idx is not a spectral library, the value returned is -1.
Returns
the number of entries in the spectral library.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getNumQueries()

int getNumQueries ( const int  resfileID = 0) const
pure virtual

Returns the number of queries (peptide masses or ms-ms spectra).

Obtained from the queries= line in the header section of the file (dat28 format) or by getting the maximum value of the 'query_id' column in the query__data table (MSR format).

Thread safe
This method is safe to use from multiple threads. See also Using Parser in multithreaded applications.
Parameters
resfileID  is the 1-based ID of the results file. When the default value of 0 is used for a single results file, this is the number of queries in the file. For Combining multiple results files, supplying a value of zero returns the total number of queries in all the results files that have been combined. Use a value of 1 to get the number of queries just in the first results file.
Returns
the number of queries or -1 if resfileID is invalid

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
peptide_list.cpp, repeat_search.cpp, resfile_error.cpp, and resfile_info.cpp.

◆ getNumResidues()

double getNumResidues ( const int  idx = 0) const
pure virtual

Returns the number of residues in the FASTA file(s) searched.

In dat28 format, the value is obtained from the residues= or residuesX= line in the header section of the file. Multiple FASTA file support was added in Mascot 2.3.

In MSR format, the value is obtained from the 'residues' column of the search__databases table.

See also
getNumSeqs(), getNumSeqsAfterTax()
Parameters
idx  index of the database. Specifying a value of 0 (the default) will return the sum of the residues in all FASTA files searched. Otherwise, the value should be in the range from 1 to ms_searchparams::getNumberOfDatabases(). Specifying a value outside the range of 0..ms_searchparams::getNumberOfDatabases() will result in -1 being returned. If the database at index idx is a spectral library, the value returned is 0.
Returns
the number of residues in the FASTA file searched. The value is returned as a double, because this method was introduced with 32-bit versions of Parser. These were unable to return an integer larger than 32 bits.
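A double can hold every integer up to 2^53 exactly, so converting the returned value to a 64-bit integer is lossless for any realistic residue count. A minimal sketch of handling the value, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if the count would not fit in a signed 32-bit integer.
bool exceedsInt32(double residues)
{
    return residues > static_cast<double>(std::numeric_limits<std::int32_t>::max());
}

// Converts a residue count returned as a double to a 64-bit integer.
// A -1 result for an out-of-range idx passes through unchanged.
std::int64_t residuesAsInt64(double residues)
{
    return static_cast<std::int64_t>(residues);
}
```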

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ getNumSeqs()

int getNumSeqs ( const int  idx = 0) const
pure virtual

Returns the number of sequences in the FASTA file(s) searched.

In dat28 format, the value is obtained from the sequences= or sequencesX= in the header section of the file. Multiple FASTA file support was added in Mascot 2.3.

In MSR format, the value is obtained from the 'sequences' column in the search__databases table.

See also
getNumSeqsAfterTax(), getNumResidues(), getNumLibraryEntries()
Parameters
idx  index of the database. Specifying a value of 0 (the default) will return the sum of all sequences in all FASTA files searched. Otherwise, the value should be in the range from 1 to ms_searchparams::getNumberOfDatabases(). Specifying a value outside the range of 0..ms_searchparams::getNumberOfDatabases() will result in -1 being returned. If the database at index idx is a spectral library, the value returned is 0.
Returns
the number of sequences in the FASTA file(s).
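The idx rules for this function (0 for the total, 1..N for one database, -1 when out of range, 0 for a spectral library) can be summarised in a small model. The DbInfo struct and the numSeqs function are hypothetical and only illustrate the documented behaviour.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-database record; isLibrary marks a spectral library.
struct DbInfo { int sequences; bool isLibrary; };

int numSeqs(const std::vector<DbInfo> &dbs, int idx = 0)
{
    const int n = static_cast<int>(dbs.size());
    if (idx < 0 || idx > n)
        return -1;                                  // outside 0..number of databases

    if (idx > 0)                                    // a single database
        return dbs[idx - 1].isLibrary ? 0 : dbs[idx - 1].sequences;

    int total = 0;                                  // idx == 0: sum over FASTA files
    for (const DbInfo &db : dbs)
        if (!db.isLibrary)
            total += db.sequences;
    return total;
}
```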

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ getNumSeqsAfterTax()

int getNumSeqsAfterTax ( const int  idx = 0) const
pure virtual

Returns the number of sequences that passed the taxonomy filter in the FASTA file(s) searched.

In dat28 format, the value is obtained from the sequences_after_tax= or sequences_after_taxX= line in the header section of the file. Multiple FASTA file support was added in Mascot 2.3.

In MSR format, the value is obtained from the 'sequences_after_tax' column of the search__databases table.

See also
getNumSeqs(), getNumResidues(), getNumLibraryEntries()
Parameters
idx  index of the database. Specifying a value of 0 (the default) will return the sum of all sequences after taxonomy in all FASTA files searched. Otherwise, the value should be in the range from 1 to ms_searchparams::getNumberOfDatabases(). Specifying a value outside the range of 0..ms_searchparams::getNumberOfDatabases() will result in -1 being returned. If the database at index idx is a spectral library, the value returned is 0.
Returns
the number of sequences that passed the taxonomy filter in the FASTA file(s) searched.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ getObservedCharge()

int getObservedCharge ( const int  query,
const bool  decoy = false 
) const
pure virtual

The 'charge' returned will be 0 for Mr, otherwise it will be 1, -1, 2, -2, 3, -3 etc. and -100 for an error.

In dat28 format, this is obtained from the qexp[query] value. It will come from the SEC_SUMMARY section. If decoy is set to true, it is obtained from the SEC_DECOYSUMMARY section.

In MSR format, the observed charge is obtained from the query__summary table. If decoy is false, the value comes from the standard target pass of the search; if decoy is true, it comes from the standard decoy pass.

Possible error values:

If an 'ambiguous' charge state is specified for the whole search or for a specific query, then Mascot just records matches for the highest scoring charge state, and it is this charge state that is returned from this function. For example, the search may have been performed with "2+, 3+ or 4+" and ms_inputquery::getCharge() will return "2+, 3+ or 4+". If the highest scoring peptide match for a particular query was to charge state 4+, then all top 10 matches for that query will be for 4+ and this function will return '4'.

It is therefore possible to get different charge values for standard peptide matches from the decoy and target passes of the results file.

In dat28 format, this method can only read the charge from the 'summary' and 'decoy_summary' sections. In MSR format, this method can only read the charge for standard target or decoy peptide matches. If the peptide match type is error tolerant or crosslink, it is best to use ms_peptide::getCharge(), as these matches can have a different charge state from the standard peptide matches in this query.

The functions getObservedIntensity() and getObservedMass() do not require the decoy parameter as the values will be identical from target and decoy passes.

See also
getObservedMrValue
Parameters
query is a number in the range 1..getNumQueries()
decoy should be false for target and true for decoy matches.
Returns
the observed charge for the peptide.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_error.cpp.
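
Because the return value mixes three cases (0 for Mr, a signed charge, and -100 for an error), callers usually need to branch on it before use. The following is a minimal sketch of that branching; chargeLabel is a hypothetical helper, not part of Mascot Parser, and the label strings are an arbitrary choice made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of Mascot Parser: turns the integer
// convention documented for getObservedCharge() into a display label.
std::string chargeLabel(int charge)
{
    if (charge == -100) {
        return "error";                 // documented error value
    }
    if (charge == 0) {
        return "Mr";                    // mass was supplied as Mr
    }
    const std::string sign = charge > 0 ? "+" : "-";
    return std::to_string(std::abs(charge)) + sign;
}
```

Checking for -100 first avoids treating the error value as a very large negative charge.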

◆ getObservedIntensity()

double getObservedIntensity ( const int  query) const
pure virtual

Returns the experimental intensity for the peptide.

In dat28 format, this is obtained from SEC_SUMMARY - qintensity[query]. In MSR format, from the query__summary table.

The observed intensity is not always available and does not need to be supplied by the end user.

Returns 0 if the query cannot be found and sets the error ms_errs::ERR_QUERYOUTOFRANGE.

Parameters
query is a number in the range 1..getNumQueries()
Returns
0 if the intensity value is not available.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getObservedMass()

double getObservedMass ( const int  query) const
pure virtual

Returns the experimental mass value as entered by the user.

In dat28 format, this is obtained from SEC_SUMMARY - qexp[query].

In MSR format, this is obtained from the query__summary table.

Returns 0 if the value cannot be found and sets the error ms_errs::ERR_QUERYOUTOFRANGE.

Parameters
query is the query number
Returns
the observed peptide mass.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getObservedMrValue()

double getObservedMrValue ( const int  query,
const bool  decoy = false 
) const
pure virtual

Returns the experimental mass value (as a relative mass) as entered by the user.

In dat28 format, this is obtained from the qmass[query] value. It will come from the SEC_SUMMARY section unless decoy is set to true in which case it is obtained from the SEC_DECOYSUMMARY section.

In MSR format, the value is obtained from the query__summary table. If decoy is false, the value comes from the standard target pass of the search; if decoy is true, it comes from the standard decoy pass.

See also
getObservedCharge

Returns 0 if the value cannot be found and sets the error ms_errs::ERR_QUERYOUTOFRANGE.

Parameters
query is a number in the range 1..getNumQueries()
decoy should only be set to true if ms_searchparams::getDECOY returns true.
Returns
the relative mass of the observed peptide.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.
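
For orientation, the relationship between an observed m/z value, the observed charge and the relative mass can be sketched as below. This is only an illustration under stated assumptions: relativeMassFromObserved and kProtonMass are hypothetical names, the proton mass constant is not taken from this page, and the value stored in the results file (as returned by getObservedMrValue()) should be preferred over any recomputation.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Assumed constant: mass of a proton in Da (not taken from this page).
const double kProtonMass = 1.007276;

// Hypothetical helper, not part of Mascot Parser. charge follows the
// getObservedCharge() convention: 0 means the value is already Mr.
// The caller is expected to have rejected the error value -100 already.
double relativeMassFromObserved(double observedMz, int charge)
{
    if (charge == 0) {
        return observedMz;
    }
    const int z = std::abs(charge);
    // Positive ions carry z extra protons; negative ions have lost z protons.
    const double protonShift = (charge > 0) ? -kProtonMass : kProtonMass;
    return z * (observedMz + protonShift);
}
```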

◆ getPercolatorFileNames()

std::vector< std::string > getPercolatorFileNames ( ) const

Retrieve the filenames used for Percolator input and output.

This function will return an empty vector unless setPercolatorFeatures() has been called beforehand.

The offsets into the return array are defined by ms_mascotresfilebase::PERCOLATOR_FILE_NAMES.

See Using STL vector classes vectori, vectord and VectorString in Perl, Java, Python and C#.

You can get the list of Percolator file names without creating an object by using matrix_science::ms_mascotresfilebase::staticGetPercolatorFileNames(const char*, const char*, const ms_mascotoptions&, const char *, const std::vector<std::string>&, std::vector<std::string>&, std::vector<bool>&).

Returns
the list of files.
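
Since the vector can be empty, it is worth guarding the index before reading an entry. The sketch below shows one way to do that; percolatorFile is a hypothetical helper, and the enum is a local copy of the documented PERCOLATOR_FILE_NAMES values so that the fragment compiles without the Parser headers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Local copy of the documented PERCOLATOR_FILE_NAMES values so that the
// sketch compiles without the Parser headers.
enum PERCOLATOR_FILE_NAMES {
    PERCOLATOR_INPUT_FILE    = 0,
    PERCOLATOR_OUTPUT_TARGET = 1,
    PERCOLATOR_OUTPUT_DECOY  = 2
};

// Hypothetical helper: returns the requested name, or an empty string if the
// vector is too short (for example, the empty vector returned when
// setPercolatorFeatures() has not been called).
std::string percolatorFile(const std::vector<std::string>& names,
                           PERCOLATOR_FILE_NAMES which)
{
    const std::size_t i = static_cast<std::size_t>(which);
    return i < names.size() ? names[i] : std::string();
}
```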

◆ getProgressInfo()

ms_progress_info * getProgressInfo ( bool  forPeptideSummary = false) const

If a matrix_science::ms_progress_info object is passed to the constructor, this is returned here.

Parameters
forPeptideSummary should be set to true to get the progress object used when creating the ms_peptidesummary, or false to return the object used when creating the ms_mascotresfilebase object.
Returns
a pointer to the progress information. This will be null if no progressMonitor was passed to the constructor.

◆ getQmatch()

int64_t getQmatch ( const int  query,
const ms_peptide::PSM_TYPE  pepType 
) const
pure virtual

Return the number of peptide masses within precursor tolerance of this query.

Return the 'qmatch' value of the query. This is the count of trials for the query, where a trial is a candidate peptide sequence + modifications whose mass is within precursor tolerance.

The count of trials could be different depending on the peptide type. For example, error tolerant peptide matches normally have much higher count of trials than peptides from the first pass of the search.

Note
This method used to be defined in ms_peptidesummary::getQmatch(). It was moved to ms_mascotresfilebase in Parser 3.0.
Parameters
query Query number, 1..getNumQueries().
pepType Peptide match type.
Returns
qmatch value for the query and peptide match type.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getQplughole()

double getQplughole ( const int  query,
const ms_peptide::PSM_TYPE  pepType 
) const
pure virtual

Return the threshold score for homologous peptide match (MIS only).

Return the 'qplughole' value of the query. This is the critical value for calculating the homology threshold. The value could be different depending on peptide type, as the empirical score distribution within a query can be different between (for example) the first and second pass of an error tolerant search.

Parameters
query Query number, 1..getNumQueries().
pepType Peptide match type.
Returns
qplughole value for the query and peptide match type.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getQuantitation()

bool getQuantitation ( ms_quant_configfile qfile) const
pure virtual

Returns an object that represents the embedded quantitation file as a reduced quantitation.xml file.

The return value indicates that the embedded quantitation file exists in the results file. Call ms_quant_configfile::isValid to determine whether the XML part has been parsed successfully.

For quantitation_2 and later, the contents of the file are validated against a schema by default. For quantitation_1, to explicitly validate against a schema, use ms_quant_configfile::setSchemaFileName() to choose a schema, and then use ms_quant_configfile::validateDocument() to validate.

See Object initialising functions in Perl, Java, Python and C#.

Deprecated:
See getQuantitationMethod() which is generally easier to use.
Parameters
qfile a pointer to quantitation file object. This must be a valid pointer to a valid object, which should normally be created using the default constructor: ms_quant_configfile::ms_quant_configfile
Returns
true if embedded file exists and false if it doesn't

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getQuantitationMethod()

bool getQuantitationMethod ( ms_quant_method qm) const

Return the quantitation method object from the embedded quantitation file.

This method returns true if all of the following apply:

  • the embedded quantitation file exists;
  • a valid schema file location has been set using the XMLschemaDirectory parameter in the constructor or the file exists in the default location;
  • the embedded quantitation file validates against the schema;
  • the quantitation method named in the QUANTITATION parameter exists;
  • the quantitation method can be loaded.

Otherwise the method returns false.

If the QUANTITATION parameter is empty or equals "none", then the method simply returns false. Otherwise, on failure, the method sets the warning ms_errs::ERR_MSP_QUANT_FAILEDLOAD.

Parameters
qm A pointer to quantitation method object. This must be a valid pointer to a valid object, which should normally be created using the default constructor ms_quant_method::ms_quant_method.
Returns
True if the method was loaded; false otherwise.

◆ getReferenceDatabaseNumberOfSL()

int getReferenceDatabaseNumberOfSL ( const int  idx) const
pure virtual

Return the database number of the reference database of a spectral library.

The reference database of a spectral library is either one of the databases searched – if the reference database was part of the actual search – or a virtual database whose number is above ms_searchparams::getNumberOfDatabases(). In the first case, getReferenceDatabaseNumberOfSL() returns a database number between 1 and ms_searchparams::getNumberOfDatabases(). In the second case, the number is above getNumberOfDatabases().

If idx does not refer to a spectral library, the method returns -1.

Parameters
idx Database number of the spectral library of interest, between 1 and ms_searchparams::getNumberOfDatabases() and whose type is SL (see getDatabaseType())
Returns
Database number of the reference database of the spectral library, or -1 if idx is not a spectral library.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.
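
The three possible outcomes described above (not a spectral library, a searched database, or a virtual database) can be told apart from the returned number alone. The following is a minimal sketch; ReferenceKind and classifyReference are hypothetical names, and numberOfDatabases stands for the value of ms_searchparams::getNumberOfDatabases().

```cpp
#include <cassert>

// Hypothetical classification of the value returned by
// getReferenceDatabaseNumberOfSL(); not part of Mascot Parser.
enum class ReferenceKind {
    NotSpectralLibrary,     // the method returned -1
    SearchedDatabase,       // 1..numberOfDatabases
    VirtualDatabase         // above numberOfDatabases
};

ReferenceKind classifyReference(int referenceNumber, int numberOfDatabases)
{
    if (referenceNumber < 1) {
        return ReferenceKind::NotSpectralLibrary;
    }
    return referenceNumber <= numberOfDatabases
               ? ReferenceKind::SearchedDatabase
               : ReferenceKind::VirtualDatabase;
}
```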

◆ getRepeatSearchString()

std::string getRepeatSearchString ( const int  query,
const bool  fullQuery = false 
) const
pure virtual

Returns the sequence query string needed to perform a repeat search of a query.

If the 'fullQuery' parameter is false (the default) then the format will be:

mr from(observed, charge) query(querynum) etc. 

If the 'fullQuery' parameter is true then the format will be:

mr from(observed, charge1, charge2...) ions() etc. 

where 'etc' will be one or more of the following (split onto several lines here for readability):

 intensity(value) peptol(value, units) seq() comp()  tag() etag() 
 title() instrument() it_mods() index() rtinseconds() rtinseconds[x]() 
 scans() scans[x]() rawscans() rawscans[x]()
  • from contains the observed mass, and the charge found in the original search. If the fullQuery parameter is true, then the charge parameter may be a list of charge states to consider, in which case the mr value is for the first charge.
  • intensity will be returned if an intensity for the peptide was entered.
  • peptol will be returned if there is a specific tolerance override for this query. In dat28 format, it is taken from the PepTol= line in the queryx section of the results file. In MSR format, from the 'pep_tol' column in the query__data table.
  • seq() will be returned if there was a sequence query. It is the exact text that was entered by the user. In dat28 format, it comes from seq1= line in the queryx section of the results file. If there were multiple seq() commands in the original search, then these will be in the results file as seq1=, seq2= etc., and will be returned as seq(...) seq(...). In MSR format, the values are selected from the query__seq table.
  • comp() will be returned if there was a composition query. It is the exact text that was entered by the user. In dat28 format, it comes from comp1= line in the queryx section of the results file. If there were multiple comp() commands in the original search, then these will be in the results file as comp1=, comp2= etc., and will be returned as comp(...) comp(...). In MSR format, the values are selected from the query__comp table.
  • tag() will be returned if there was a tag query. It is the exact text that was entered by the user. In dat28 format, it comes from tag1=t line in the queryx section of the results file. If there were multiple tag() commands in the original search, then these will be in the results file as tag1=, tag2= etc., and will be returned as tag(...) tag(...). In MSR format, the values are selected from the query__seq_tag table.
  • etag() will be returned if there was an etag query. It is the exact text that was entered by the user. In dat28 format, it comes from tag1=e line in the queryx section of the results file. If there were multiple etag() commands in the original search, then these will be in the results file as tag1=, tag2= etc., and will be returned as etag(...) etag(...). In MSR format, the values are selected from the query__seq_tag table.
  • title will be returned if there was a title for the particular MS-MS spectrum. The title is returned as an escaped string.
  • instrument will be returned if there was an instrument defined at the query level in the original search. It is returned as an escaped string.
  • it_mods will be returned if modifications were defined at the query level in the original search. If there was more than one modification, these will be returned as a comma separated list.
  • index will be returned if there was an index defined at the query level in the original search which will be the case for all searches in Mascot 2.3 and later. See ms_inputquery::getIndex.
  • scans or scans[x] will be returned if these were defined at the query level in the original search. They will have the format scans(29-34, 43) or scans[0](29-34) scans[1](43).
  • rtinseconds or rtinseconds[x] will be returned if these were defined at the query level in the original search. They will have the format rtinseconds(10-20, 25) or rtinseconds[0](10-20) rtinseconds[1](25).
  • rawscans or rawscans[x] will be returned if these were defined at the query level in the original search. They will have the format, for example, of: rawscans[0](pd0cy1ex1:pd0cy1ex3) rawscans[1](fn2ix1).
  • rawfile will be returned if these were defined at the query level in the original search. They will have the format, for example, of: rawfile(c:/data/rawfile.raw).
  • locus will be returned if these were defined at the query level in the original search. They will have the format, for example, of: locus(2.1.1.24.1).
  • query is only used for MS-MS spectra. To save data transfer with a repeat search, this command is used instead of resubmitting all the MS-MS ions values. When nph-mascot.exe comes to the query(x) command it gets the ions values from the original .dat file.
  • ions will be used to give a list of peaks.

Returns 0 if the value cannot be found and sets the error ms_errs::ERR_QUERYOUTOFRANGE.

See Automated repeating of searches.

Parameters
query is a number in the range 1..getNumQueries()
fullQuery If true, then a complete and self contained sequence query will be returned. See above for details.
Returns
repeat search string

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
repeat_search.cpp, and resfile_input.cpp.
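
If the returned string needs to be inspected rather than passed on unchanged, it can be split into its name(argument) commands. The sketch below is one possible tokenizer; splitCommands is a hypothetical helper, not part of Mascot Parser, and it assumes that argument text contains no unescaped ')' character.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Mascot Parser. Splits a sequence query
// string into (name, argument) pairs. A token without parentheses, such as
// the leading Mr value, is returned with an empty argument.
std::vector<std::pair<std::string, std::string>>
splitCommands(const std::string& text)
{
    std::vector<std::pair<std::string, std::string>> commands;
    std::size_t pos = 0;
    while (pos < text.size()) {
        if (std::isspace(static_cast<unsigned char>(text[pos]))) {
            ++pos;
            continue;
        }
        // Read the command name up to '(' or whitespace.
        std::size_t end = pos;
        while (end < text.size() && text[end] != '(' &&
               !std::isspace(static_cast<unsigned char>(text[end]))) {
            ++end;
        }
        const std::string name = text.substr(pos, end - pos);
        std::string args;
        if (end < text.size() && text[end] == '(') {
            const std::size_t close = text.find(')', end);
            if (close == std::string::npos) {
                args = text.substr(end + 1);    // unterminated: take the rest
                end = text.size();
            } else {
                args = text.substr(end + 1, close - end - 1);
                end = close + 1;
            }
        }
        commands.emplace_back(name, args);
        pos = end;
    }
    return commands;
}
```

Applied to the documented form scans[0](29-34) scans[1](43), this yields two pairs, one per scans[x] command. Escaped strings such as title() are returned still escaped.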

◆ getResfile()

const ms_mascotresfilebase * getResfile ( int  id) const
pure virtual

See Combining multiple results files and Maintaining object references: two rules of thumb.

Parameters
id is the 1-based ID of the results file and must be in the range 1..getNumberOfResfiles(). A value of 1 will (not particularly usefully!) return a pointer to the ms_mascotresfilebase that was originally created. A value of 2 will return the ms_mascotresfilebase created in the first successful call to appendResfile(), and so on.
Returns
a pointer to the ms_mascotresfilebase object, or 0 if the parameter is invalid.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getSearchParameter()

std::string getSearchParameter ( const std::string &  key) const
pure virtual

Return the search parameter for the given key.

Get the value associated with the input key in the search parameters. To get all the values, use ms_mascotresfilebase::getSearchParametersKeyValues().

Parameters
[in] key A non-empty string.
Returns
the value associated with the key, or an empty string if the key does not exist.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getSearchParametersKeyValues()

void getSearchParametersKeyValues ( std::vector< std::string > &  keys,
std::vector< std::string > &  values 
) const
pure virtual

Return all the search parameters as key-value pairs.

Get all the search parameters as key-value pairs. Search parameters are more easily accessible through ms_searchparams. This low-level method is useful for creating repeat searches, where the search parameters should just be copied as is.

To get the value of an individual header key, use ms_mascotresfilebase::getSearchParameter().

Parameters
[out] keys A vector of non-empty keys.
[out] values A vector of strings (some may be empty) in the same order as keys.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.
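
The two output vectors are parallel, so they are often easier to work with once paired up. A minimal sketch of that step follows; zipParameters is a hypothetical helper, not part of Mascot Parser, and the inputs stand for the vectors filled by getSearchParametersKeyValues().

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper, not part of Mascot Parser: pairs up the two parallel
// vectors filled by getSearchParametersKeyValues().
std::map<std::string, std::string>
zipParameters(const std::vector<std::string>& keys,
              const std::vector<std::string>& values)
{
    std::map<std::string, std::string> params;
    // Stop at the shorter vector in case the sizes ever differ.
    const std::size_t n = keys.size() < values.size() ? keys.size()
                                                      : values.size();
    for (std::size_t i = 0; i < n; ++i) {
        params[keys[i]] = values[i];    // values may be empty strings
    }
    return params;
}
```

A std::map sorts by key and so loses the original order. When the parameters are copied as is into a repeat search, iterating over the two vectors directly keeps them in file order.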

◆ getSLDatabaseNumbersOfReference()

std::vector< int > getSLDatabaseNumbersOfReference ( const int  idx) const
pure virtual

Return the database numbers of the spectral libraries whose reference database is at the given index.

The reference database of a spectral library is either one of the databases searched – if the reference database was part of the actual search – or a virtual database whose number is above ms_searchparams::getNumberOfDatabases(). The same database can be the reference database of multiple spectral libraries.

If idx does not refer to a reference database, the method returns the empty vector.

Parameters
idx Database number of the reference database of interest.
Returns
A vector of database numbers of the spectral libraries whose reference database is idx. This vector can be empty.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getSLExecCommand()

std::string getSLExecCommand ( int  idx = 1) const
pure virtual

Returns the library search command line and parameters (sl_exec_command).

The command line for the library search is typically like:

    ../bin/NIST/mspepsearch/MSPepSearch.exe m a P /Z 0.1 /M 0.602993 /LIB [PATH_TO_MSP_FILE] /INP [PATH_TO_MSP_FILE] /OUTTAB [PATH_TO_TSV_FILE] /HITS 10 /MinMF 0 /NumCompared /OutPrecursorMz /OutDeltaPrecursorMz /OutSpecNum 

The exact contents depend on the file paths, fragment tolerance and library search options specified in mascot.dat.

In dat28 format, the command string is saved as sl_exec_commandX= line in the header section.

In MSR format, the command string is saved in the search__spectral_libraries table as 'sl_exec_command'.

If idx is outside its range, the method returns the empty string.

Parameters
idx index of the database from 1 to ms_searchparams::getNumberOfDatabases().
Returns
the command line for the spectral library search of the given library.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getSLFragmentTolerance()

double getSLFragmentTolerance ( int  idx = 1) const
pure virtual

Returns the effective spectral library fragment tolerance.

When a search is run against a spectral library, the effective fragment tolerance is calculated from the user-configured library fragment tolerance and the tolerance specified in the search parameters. This may be different from ms_searchparams::getITOL() and ms_searchparams::getITOLU().

The tolerance unit can be retrieved with getSLFragmentToleranceUnit().

In dat28 format, in Mascot 2.6.01 and later, the effective tolerance value and unit are saved in the header section of the results file as sl_itolX=. For files created by Mascot 2.6.00, the value is parsed from the sl_exec_commandX= line if present.

In MSR format, the effective tolerance value and unit are saved in the search__spectral_libraries table as 'itol'.

If idx is outside its range, the method returns 0.0.

Parameters
idx index of the database from 1 to ms_searchparams::getNumberOfDatabases().
Returns
the effective fragment tolerance.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getSLFragmentToleranceUnit()

std::string getSLFragmentToleranceUnit ( int  idx = 1) const
pure virtual

Returns the unit of the effective spectral library fragment tolerance.

When a search is run against a spectral library, the effective fragment tolerance is calculated from the user-configured library fragment tolerance and the tolerance specified in the search parameters. This may be different from ms_searchparams::getITOL() and ms_searchparams::getITOLU().

The tolerance can be retrieved with getSLFragmentTolerance().

In dat28 format, in Mascot 2.6.01 and later, the effective tolerance value and unit are saved in the header section of the results file as sl_itolX=. For files created by Mascot 2.6.00, the value is parsed from the sl_exec_commandX= line if present.

In MSR format, the effective tolerance value and unit are saved in the search__spectral_libraries table as 'itol_units'.

If idx is outside its range, the method returns an empty string.

Parameters
idx index of the database from 1 to ms_searchparams::getNumberOfDatabases().
Returns
the tolerance unit.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getSrcQueryAndFileIdForMultiFile()

bool getSrcQueryAndFileIdForMultiFile ( const int  q,
int &  gsqNewQuery,
int &  gsqFileId 
) const
pure virtual

Return the query number and file ID in the source results file.

See Multiple return values in Perl, Java, Python and C#.

Useful for combining multiple results files (see Combining multiple results files) but also returns valid values for a single file.

Example: Assume that the primary results file has 10 queries, file 2 has 20 queries and file 3 has 30 queries. This function will return the following:

q   newQuery  fileId
6   6         1
21  11        2
31  1         3
Thread safe
This method is safe to use from multiple threads. See also Using Parser in multithreaded applications.
See also
getMultiFileQueryNumber() for the 'inverse' function
Parameters
q is the query number which should be a value between 1 and getNumQueries().
gsqNewQuery is used to return the query number in the specified source file.
gsqFileId is a 1 based index to the source file.
Returns
true if the query is in the range 1..getNumQueries().

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.
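
The mapping in the example above (files with 10, 20 and 30 queries) can be reproduced with a running offset. This is a minimal sketch that assumes combined query numbers run consecutively in file order, as the example implies; sourceQueryAndFileId and queriesPerFile are hypothetical names, not part of Mascot Parser.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Mascot Parser. queriesPerFile[i] is the
// number of queries in file i + 1. Returns false if q is out of range, in
// which case newQuery and fileId are left unchanged.
bool sourceQueryAndFileId(const std::vector<int>& queriesPerFile, int q,
                          int& newQuery, int& fileId)
{
    if (q < 1) {
        return false;
    }
    int offset = 0;                                 // queries in earlier files
    for (std::size_t i = 0; i < queriesPerFile.size(); ++i) {
        if (q <= offset + queriesPerFile[i]) {
            newQuery = q - offset;
            fileId = static_cast<int>(i) + 1;       // file IDs are 1-based
            return true;
        }
        offset += queriesPerFile[i];
    }
    return false;                                   // q exceeds the total
}
```

With queriesPerFile set to {10, 20, 30}, query 21 is the 11th query of file 2 because the first file contributes an offset of 10.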

◆ getTaxonomy()

bool getTaxonomy ( ms_taxonomyfile tfile) const
pure virtual

Returns an object that represents the embedded taxonomy file as a reduced taxonomy file.

The return value only indicates that the embedded file exists. If you want to find out whether the content has been parsed successfully, call the methods of matrix_science::ms_taxonomyfile.

See Object initialising functions in Perl, Java, Python and C#.

Parameters
tfile a pointer to taxonomy file object that will accept the content from the embedded taxonomy file.
Returns
true if the embedded taxonomy file was read and parsed.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getUnimod()

bool getUnimod ( ms_umod_configfile ufile,
bool  useSchemaFromResfile = false 
) const
pure virtual

Returns an object that represents the embedded unimod file as a reduced unimod_2.xml file.

The return value only indicates that the embedded Unimod XML exists in the results file. If you want to find out whether the XML part has been parsed successfully, call ms_umod_configfile::isValid

See Object initialising functions in Perl, Java, Python and C#.

Parameters
ufile a pointer to unimod file object that will accept the XML content.
useSchemaFromResfile determines where the location of the XML schema is defined. If 'true', then the schema location should have been defined by specifying XMLschemaDirectory in the constructor. If 'false', then ms_umod_configfile::setSchemaFileName must have been called on ufile before calling this function. This parameter was added in Parser 2.5, and the default value is false to ensure that it is backward compatible with previous versions.
Returns
true if embedded file exists and false if it doesn't.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getUnimodXL()

bool getUnimodXL ( ms_umod_configfile ufile,
bool  useSchemaFromResfile = false 
) const
pure virtual

Returns an object that represents the embedded unimod_xl file as a reduced unimod_xl.xml file.

The return value only indicates that the embedded Unimod crosslinking file exists in the results file. If you want to find out whether the XML part has been parsed successfully, call ms_umod_configfile::isValid

See Object initialising functions in Perl, Java, Python and C#.

Parameters
ufile a pointer to unimod file object that will accept the XML content.
useSchemaFromResfile determines where the location of the XML schema is defined. If 'true', then the schema location should have been defined by specifying XMLschemaDirectory in the constructor. If 'false', then ms_umod_configfile::setSchemaFileName must have been called on ufile before calling this function. getUnimodXL uses the same schema as ms_mascotresfilebase::getUnimod.
Returns
true if embedded file exists and false if it doesn't.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ getUniqueTaskID()

std::string getUniqueTaskID ( ) const
pure virtual

Returns the unique task ID used by Mascot Daemon.

Although this value is a number, it is a 64 bit integer. Some languages on some platforms cannot deal with 64 bit integers properly, so the value is returned as a string. For searches with no taskid, an empty string is returned.

Returns
unique task id as a string

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.
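
Where the language does handle 64 bit integers, the string can be converted back to a number. The sketch below assumes the ID is a non-negative decimal number, which this page does not state explicitly; parseTaskId is a hypothetical helper, not part of Mascot Parser.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of Mascot Parser. Converts a task ID string
// to a 64-bit unsigned integer. Returns false for an empty string (a search
// with no task ID) or for text that is not entirely decimal digits.
bool parseTaskId(const std::string& text, std::uint64_t& id)
{
    if (text.empty()) {
        return false;                   // search has no task ID
    }
    for (char c : text) {
        if (c < '0' || c > '9') {
            return false;               // assumed format: decimal digits only
        }
    }
    errno = 0;
    const unsigned long long value = std::strtoull(text.c_str(), nullptr, 10);
    if (errno == ERANGE) {
        return false;                   // value is too large to represent
    }
    id = value;
    return true;
}
```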

◆ getXMLschemaFilePath()

std::string getXMLschemaFilePath ( XML_SCHEMA  XMLschema) const

Gets the XML schema to be used by functions using quantitation or unimod.

See also
setXMLschemaFilePath
Parameters
XMLschema The XML_SCHEMA enumeration value of the required xml schema file.
Returns
The requested xml schema value or an empty string if it has not been set.

◆ hasEnzyme()

bool hasEnzyme ( ) const
pure virtual

Return true if the results file contains information about the enzyme used.

Check whether the results file contains enzyme information as an embedded 'enzymes' file.

Returns
true if the file contains enzyme information. Always true for results from Mascot Server 2.2 and later.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ hasQuantitation()

bool hasQuantitation ( ) const
pure virtual

Return true if the results file contains quantitation data.

Check whether the results file contains an embedded quantitation method.

Returns
true if the file has an embedded quantitation method.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ hasRT()

bool hasRT ( ) const
pure virtual

Return true if the results file contains retention time data.

Check whether the first query has retention time stored in the RTINSECONDS field. If it does, return true. The implicit assumption is that all the other queries also have (or don't have) RTINSECONDS if the first query has (or doesn't have).

Returns
true if the file contains retention time data.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ isDatabaseTypeAvailable()

bool isDatabaseTypeAvailable ( ) const
pure virtual

Check whether database types are available.

In dat28 format, Mascot 2.6 and later save the type of the database(s) in the results file, as db_typeX= lines in the header section. If the types are not available, the database or databases searched could be AA or NA.

In MSR format, introduced in Mascot Server 3.0, the 'db_type' column is always present in the search__databases table.

Returns
true if the db_typeX= lines exist in the dat28 results file, false otherwise. Always true in MSR format.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

◆ isErrorTolerant()

bool isErrorTolerant ( ) const
pure virtual

Returns true if the search was an error tolerant search.

In dat28 format, obtained from the ERRORTOLERANT value in the parameters section. In MSR format, obtained from the ERRORTOLERANT value in the search__parameters table.

In Mascot versions 1.8 and later, an error tolerant search can be run as a repeat search. In this case, one or more ACCESSIONs (which may be retrieved using ms_searchparams::getACCESSION ) must have been specified, and the results file will just contain the error tolerant search results. In Mascot 2.2 and later, a single search can be performed which contains both the standard search results and the error tolerant search results of automatically selected proteins. In this case, ms_searchparams::getACCESSION will return an empty string.

See Error tolerant searches.

See also
https://www.matrixscience.com/help/error_tolerant_help.html
Returns
true if the results file is from an error tolerant search, false otherwise

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ isMSMS()

bool isMSMS ( ) const
pure virtual

Returns true if the search was an MSMS search (SEARCH=MIS).

Since all types of search may be entered as a sequence query, it may be more useful to use the anyMSMS() member.

Returns
true if SEARCH=MIS in the parameters section or search__parameters table of the file.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
peptide_list.cpp, and resfile_info.cpp.

◆ isPMF()

bool isPMF ( ) const
pure virtual

Returns true if the search was a PMF search (SEARCH=PMF).

Since all types of search may be entered as a sequence query, it may be more useful to use the anyPMF() member.

Returns
true if SEARCH=PMF in the parameters section or search__parameters table of the file.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ isSQ()

bool isSQ ( ) const
pure virtual

Returns true if the search was a sequence query search (SEARCH=SQ).

Since all types of search may be entered as a sequence query, it may be more useful to use the anySQ() member.

Returns
true if SEARCH=SQ in the parameters section or search__parameters table of the file.

Implemented in ms_mascotresfile_dat, and ms_mascotresfile_msr.

Examples
resfile_info.cpp.

◆ isValid()

bool isValid ( ) const
inherited

◆ outputKeepAlive()

bool outputKeepAlive ( ) const

Outputs the "keep-alive" string during time-consuming operations.

For HTML reports with large result files it is sometimes necessary to output HTML comments to keep the connection alive. This can be done by specifying the interval at which the text is output (keepAliveInterval) and the text that should be output (keepAliveText) as parameters to the ms_mascotresfilebase::ms_mascotresfilebase() constructor.

If the value of keepAliveInterval passed to ms_mascotresfilebase::ms_mascotresfilebase is not 0, then the text specified by keepAliveText will be output approximately every keepAliveInterval seconds. A '%d' in the keepAliveText will be replaced by the number of seconds since the process started.

This functionality is implemented by calling this function 'often', rather than by using a separate thread. This means that the times between calls will not be accurate. A computationally intensive application that uses Mascot Parser can also call this function as required.

From version 2.3 onwards, the keepAliveText can contain tags that allow different text to be output for different lengthy tasks. The tags are ci=, rf=, ap=, gp=, ul=, qu=, cc= and fd=, each followed by the text to output for that task.

The text can also include the following tags which are substituted by values:

  • %d - The time in seconds elapsed since the initial call to ms_mascotresfilebase.
  • %p - The percentage complete for the current task.
  • %h - The hit number.
  • %q - The query number.
  • %a - The accession string for the current protein being processed.
  • %f - The filename of the .dat file being processed. May be useful for the ci= value with Combining multiple results files.

The following table indicates which values are available for which tasks:

      ci=  rf=  ap=  gp=  ul=  cc=  fd=
  %d   X    X    X    X    X    X    X
  %p   X    X    X    X    X    X    X
  %h             X    X         X
  %q             X    X    X    X
  %a             X    X         X
  %f   X    X    X    X    X    X

The %a and %h values for cc= are not output for the second half of caching.

A 'complete' example string might be:

  ul=Creating unassigned list (%p% complete)\n
  qu=Calculating quantitation component intensities (%p% complete)\n
  ci=Creating cache file (%p% complete)\n
  rf=Reading results file (%p% complete)\n
  ap=Assigning peptides to proteins (%p% complete) hit=%h, time=%d\n
  gp=Found protein group: %a, hit=%h, %p% complete, %d seconds\n
  cc=Caching results (%p% complete)\n
  fd=Calculating false discovery rate (%p% complete)\n

Any text before the first tag will be used as a default for cases where text isn't supplied for a particular task. For example:

  Processing: %p% complete gp=Grouping %a

would output the text:

  Processing: 23% complete

for all tasks except the protein grouping which would output:

  Grouping gi|12345
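
The default-text and per-task behaviour described above can be modelled with a short standalone sketch. It is only an illustration of the documented text format, not the Mascot Parser implementation: the names splitKeepAliveText, expandTags and keepAliveLine are hypothetical, and the rule that a tag must start the string or follow whitespace is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Task tags taken from the table and example string in this section.
static const std::array<const char*, 8> kTaskTags = {
    "ci=", "rf=", "ap=", "gp=", "ul=", "qu=", "cc=", "fd="};

struct KeepAliveTemplates {
    std::string defaultText;                     // text before the first tag
    std::map<std::string, std::string> perTask;  // "gp" -> "Grouping %a"
};

// Remove leading and trailing spaces (newlines are kept).
static std::string trimSpaces(const std::string& s) {
    const std::size_t b = s.find_first_not_of(' ');
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(' ');
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: split keepAliveText into default and per-task text.
// Assumption: a tag is only recognised at the start or after whitespace.
KeepAliveTemplates splitKeepAliveText(const std::string& text) {
    std::vector<std::pair<std::size_t, std::string>> hits;
    for (std::size_t i = 0; i + 3 <= text.size(); ++i) {
        const bool boundary = (i == 0) || text[i - 1] == ' ' || text[i - 1] == '\n';
        if (!boundary) continue;
        for (const char* tag : kTaskTags) {
            if (text.compare(i, 3, tag) == 0) {
                hits.emplace_back(i, std::string(tag, 2));
                break;
            }
        }
    }
    KeepAliveTemplates out;
    const std::size_t firstTag = hits.empty() ? text.size() : hits.front().first;
    out.defaultText = trimSpaces(text.substr(0, firstTag));
    for (std::size_t k = 0; k < hits.size(); ++k) {
        const std::size_t begin = hits[k].first + 3;
        const std::size_t end = (k + 1 < hits.size()) ? hits[k + 1].first : text.size();
        out.perTask[hits[k].second] = trimSpaces(text.substr(begin, end - begin));
    }
    return out;
}

// Replace %x with values['x'] when a value is supplied; other text is copied.
std::string expandTags(const std::string& tmpl, const std::map<char, std::string>& values) {
    std::string out;
    for (std::size_t i = 0; i < tmpl.size(); ++i) {
        if (tmpl[i] == '%' && i + 1 < tmpl.size()) {
            const auto it = values.find(tmpl[i + 1]);
            if (it != values.end()) {
                out += it->second;
                ++i;  // skip the tag letter
                continue;
            }
        }
        out += tmpl[i];
    }
    return out;
}

// Use the task-specific text if present, otherwise the default text.
std::string keepAliveLine(const KeepAliveTemplates& t, const std::string& task,
                          const std::map<char, std::string>& values) {
    const auto it = t.perTask.find(task);
    return expandTags(it != t.perTask.end() ? it->second : t.defaultText, values);
}
```

With the text "Processing: %p% complete gp=Grouping %a", calling keepAliveLine for the "rf" task falls back to the default text, while the "gp" task uses its own text.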

Returns
true if the work is complete, false otherwise

◆ params()

ms_searchparams & params ( ) const
inline

Returns a reference to the search parameters class.

For C# only, params is a keyword, so this function is renamed to _params.

Returns
reference to the ms_searchparams object

◆ resetKeepAlive()

void resetKeepAlive ( const int  keepAliveInterval,
const char *  keepAliveText,
const bool  propagateToAppended = true,
const bool  resetStartTime = false 
)

Replace the existing keepAlive values with new values.

KeepAlive values are passed when creating the ms_mascotresfilebase object, but it can be useful to change these at a later time.

See outputKeepAlive() for further details.

Parameters
keepAliveInterval is the new interval in seconds. Specify a value of -1 to keep the old value, or 0 to stop outputting keepAliveText.
keepAliveText is the text to output every keepAliveInterval seconds while the file is being loaded, or while other tasks are in progress.
propagateToAppended only has meaning if additional files have been appended by calling appendResfile().
resetStartTime can be set to true to reset the "%d" value to zero. See outputKeepAlive() for details.

◆ resfileType()

ms_mascotresfilebase::RESFILE_TYPE resfileType ( const std::string &  fileName)
static

Return the results format of the file provided as an argument.

This method tries to open the file and 'sniff' the first few bytes. If those bytes are the SQLite database header, then this is a Mascot Search Results (MSR) file and the method returns RESFILE_MSR. If the bytes look like a MIME format header, then this is a dat28 (.dat) file and the method returns RESFILE_DAT28. In any other case, the method returns RESFILE_UNKNOWN.

Parameters
fileName is the relative or absolute path to the file to 'sniff'.
Returns
the file type.
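
The 'sniffing' described above can be sketched in standalone C++. This is a rough model rather than the library's code: SketchType, sniffHeader and sniffFile are hypothetical names, and treating "MIME-Version:" as the start of a MIME format header is an assumption. The 16-byte SQLite header string ("SQLite format 3" followed by a NUL byte) is the standard SQLite file signature.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// Local mirror of the documented RESFILE_TYPE values (not the library enum).
enum SketchType { SKETCH_DAT28 = 0, SKETCH_MSR = 1, SKETCH_UNKNOWN = 2 };

// Classify a file from its first few bytes.
SketchType sniffHeader(const std::string& head) {
    // 15 characters plus the terminating NUL gives the 16-byte SQLite magic.
    static const char sqliteMagic[] = "SQLite format 3";
    if (head.size() >= sizeof sqliteMagic &&
        std::memcmp(head.data(), sqliteMagic, sizeof sqliteMagic) == 0) {
        return SKETCH_MSR;
    }
    // Assumption: a dat28 file begins with a MIME-Version header line.
    static const std::string mimePrefix = "MIME-Version:";
    if (head.compare(0, mimePrefix.size(), mimePrefix) == 0) {
        return SKETCH_DAT28;
    }
    return SKETCH_UNKNOWN;
}

// Read up to 32 bytes from the file and classify them.
// A file that cannot be opened is reported as unknown.
SketchType sniffFile(const std::string& fileName) {
    std::ifstream in(fileName, std::ios::binary);
    if (!in) return SKETCH_UNKNOWN;
    char buf[32] = {};
    in.read(buf, sizeof buf);
    return sniffHeader(std::string(buf, static_cast<std::size_t>(in.gcount())));
}
```

Separating sniffHeader from sniffFile keeps the byte comparison testable without touching the file system.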

◆ setErrorInfoFromString()

bool setErrorInfoFromString ( const std::string &  e)
protected

For restoring any errors in the .cdb file

Parameters
e is the stored string obtained from calling getErrorInfoAsString.
Returns
true if the input string is parsed successfully

◆ setPercolatorFeatures() [1/2]

void setPercolatorFeatures ( const char *  percolatorFeatures,
const char *  additionalFeatures,
const bool  useRetentionTimes 
)

Set Percolator features before creating an ms_peptidesummary with Percolator scoring (deprecated).

Deprecated:
This function is deprecated and has been replaced with matrix_science::ms_mascotresfilebase::setPercolatorFeatures(const ms_mascotoptions&, const char*, const std::vector<std::string>&).

The names (and contents!) of the Percolator files depend on the features that have been enabled. When running from a report script on Mascot Server, turning on an additional feature in percolatorFeatures will cause Mascot to create a new file, but the old file will still be available if the feature is removed again from percolatorFeatures.

When running outside Mascot Server, the parameters passed to this method must match the parameters used by the report scripts, in order to get the same Percolator file names. If the pip/pop files are from Mascot Server 2.8 or later, please use matrix_science::ms_mascotresfilebase::setPercolatorFeatures(const ms_mascotoptions&, const char*, const std::vector<std::string>&) and the same mascot.dat options as were used on the server.

Parameters
percolatorFeatures is normally retrieved by calling ms_mascotoptions::getPercolatorFeatures().
additionalFeatures is normally a string passed to ms-createpip.exe. For example, "-a numUniqPeps -r varmods" would add numUniqPeps and remove varmods from the default. An empty string means nothing is added or removed from ms_mascotoptions::getPercolatorFeatures().
useRetentionTimes is a flag to indicate whether retention time information is used by percolator.exe. This value is normally retrieved by calling ms_mascotoptions::isPercolatorUseRT().

◆ setPercolatorFeatures() [2/2]

void setPercolatorFeatures ( const ms_mascotoptions &  options,
const char *  additionalFeatures,
const std::vector< std::string > &  adapterParameters = std::vector<std::string>() 
)

Set Percolator features before creating an ms_peptidesummary with Percolator scoring.

The names (and contents!) of the Percolator files depend on the features that have been enabled. When running from a report script on Mascot Server, turning on an additional feature in PercolatorFeatures will cause Mascot to create a new file, but the old file will still be available if the feature is removed again from PercolatorFeatures.

The method uses the following fields from options:

  • PercolatorFeatures (string)
  • PercolatorUseRT (true/false)
  • PercolatorUseProteins (true/false)
  • PercolatorExeFlags (string)
  • PercolatorTargetRankScoreThreshold (int)
  • PercolatorTargetRankRelativeThreshold (double)

When running outside Mascot Server, the parameters passed to this method must match the parameters used by the report scripts, in order to get the same Percolator file names. If the pip/pop files are from Mascot Server 2.8 or later, please use the same mascot.dat options as were used on the server. Make sure you also use the same adapterParameters.

Parameters
options contains the options stored in mascot.dat.
additionalFeatures is normally a string passed to ms-createpip.exe. For example, "-a numUniqPeps -r varmods" would add numUniqPeps and remove varmods from the default.
adapterParameters is a vector of parameters to ML adapters introduced in Mascot Server 3.0. If the vector is not empty, then the parameters are sorted and hashed into an additional MD5 checksum component of the pip/pop file names.

◆ setXMLschemaFilePath()

bool setXMLschemaFilePath ( XML_SCHEMA  XMLschema,
const char *  path 
)

Sets the XML schema to be used by functions using quantitation or unimod.

It is generally easier to pass a directory as the XMLschemaDirectory parameter to the constructor rather than calling this function for each of the required schema.

Warning
If a relative path for the xsd is specified, then this will be relative to the document and not relative to the current working directory.

Example:

  std::string qs;
  qs  = "http://www.matrixscience.com/xmlns/schema/quantitation_1 ";
  qs += "C:/myfiles/quant_schema_1.xsd ";
  qs += "http://www.matrixscience.com/xmlns/schema/quantitation_2 ";
  qs += "../schema%20files/quantitation_2.xsd";
  setXMLschemaFilePath(XML_SCHEMA_QUANTITATION, qs.c_str());

The default values used in cases where this function has not been called and no parameter is passed to the constructor are the values suitable for scripts and programs running on the Mascot Server. These values are:

XML_SCHEMA_QUANTITATION : 
    http://www.matrixscience.com/xmlns/schema/quantitation_1 ../html/xmlns/schema/quantitation_1/quantitation_1.xsd
    http://www.matrixscience.com/xmlns/schema/quantitation_2 ../html/xmlns/schema/quantitation_2/quantitation_2.xsd

XML_SCHEMA_UNIMOD :
    http://www.unimod.org/xmlns/schema/unimod_2 ../html/xmlns/schema/unimod_2/unimod_2.xsd

XML_SCHEMA_CROSSLINKING :
    http://www.matrixscience.com/xmlns/schema/crosslinking_1 ../html/xmlns/schema/crosslinking_1/crosslinking_1.xsd
Parameters
XMLschema must be one of the valid XML_SCHEMA values.
path should be a list of pairs "_schema_alias_ SPACE _file_path_", where SPACE is the space character. See XML_SCHEMA for the supported _schema_alias_ values for each type of schema.
Returns
true if the XMLschema value is valid
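
To make the expected layout of the path argument concrete, the following standalone sketch splits such a string into (schema alias, file path) pairs. The function name splitSchemaPairs is hypothetical and this is not how Mascot Parser itself parses the value. It relies on the convention shown in the example above, where a space inside a file path is written as %20 so that the space character can act as the separator.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split "alias path alias path ..." into pairs.
// A trailing alias without a matching path is ignored.
std::vector<std::pair<std::string, std::string>>
splitSchemaPairs(const std::string& path) {
    std::istringstream in(path);
    std::vector<std::pair<std::string, std::string>> pairs;
    std::string alias;
    std::string file;
    // operator>> skips whitespace, so repeated spaces are tolerated.
    while (in >> alias >> file) {
        pairs.emplace_back(alias, file);
    }
    return pairs;
}
```

Applied to the quantitation string built in the example above, this yields two pairs, one for quantitation_1 and one for quantitation_2.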

◆ staticGetPercolatorFileNames() [1/2]

bool staticGetPercolatorFileNames ( const char *  szFileName,
const char *  cacheDirectory,
const char *  percolatorFeatures,
const char *  additionalFeatures,
const bool  useRetentionTimes,
std::vector< std::string > &  filenames,
std::vector< bool > &  exists 
)
static

Returns a list of the Percolator input and output files for the specified data file (deprecated).

Deprecated:
This function is deprecated and has been replaced with matrix_science::ms_mascotresfilebase::staticGetPercolatorFileNames(const char*, const char*, const ms_mascotoptions&, const char *, const std::vector<std::string>&, std::vector<std::string>&, std::vector<bool>&).

This static function can be called without creating an ms_mascotresfilebase object, and can be used in advance of creating an object to see if the percolator files will need to be created. If an object has already been created, it is normally easier to call setPercolatorFeatures() and then getPercolatorFileNames().

See Using Percolator scores and Using STL vector classes vectori, vectord and VectorString in Perl, Java, Python and C#.

Parameters
szFileName is the absolute or relative path to the results file.
cacheDirectory will normally be the value returned from ms_mascotoptions::getCacheDirectory. If it's empty, then the default pattern is used (../data/cache/%Y/%d).
percolatorFeatures is normally retrieved by calling ms_mascotoptions::getPercolatorFeatures(). The filenames encode the features so that there is no conflict.
additionalFeatures is normally a string passed to ms-createpip.exe. For example, "-a numUniqPeps -r varmods" would add numUniqPeps and remove varmods from the default. Any other parameters except -a and -r are ignored. An empty string means nothing is added or removed from percolatorFeatures.
useRetentionTimes is a flag to indicate whether retention time information is used by percolator.exe.
filenames returns the list of files. The offsets are defined by ms_mascotresfilebase::PERCOLATOR_FILE_NAMES.
exists is a boolean vector which will return flags indicating whether the Percolator files exist. The values correspond to the values in the filenames vector.
Returns
true if all files exist

◆ staticGetPercolatorFileNames() [2/2]

bool staticGetPercolatorFileNames ( const char *  szFileName,
const char *  cacheDirectory,
const ms_mascotoptions &  options,
const char *  additionalFeatures,
const std::vector< std::string > &  adapterParameters,
std::vector< std::string > &  filenames,
std::vector< bool > &  exists 
)
static

Returns a list of the Percolator input and output files for the specified data file.

This static function can be called without creating an ms_mascotresfilebase object, and can be used in advance of creating an object to see if the percolator files will need to be created. If an object has already been created, it is normally easier to call setPercolatorFeatures() and then getPercolatorFileNames().

The function uses the following fields from options:

  • PercolatorFeatures (string)
  • PercolatorUseRT (true/false)
  • PercolatorUseProteins (true/false)
  • PercolatorExeFlags (string)
  • PercolatorTargetRankScoreThreshold (int)
  • PercolatorTargetRankRelativeThreshold (double)

Make sure you set PercolatorExeFlags in options based on whether the results file has any queries with a retention time. Otherwise, this method may generate a filename different from setPercolatorFeatures().

    bool anyRT = (whether any query has RTINSECONDS);
    std::string percolatorFlags = options.getPercolatorRtFlags(anyRT, options.isPercolatorUseRT());
    options.setPercolatorExeFlags(percolatorFlags.c_str());

See Using Percolator scores and Using STL vector classes vectori, vectord and VectorString in Perl, Java, Python and C#.

Parameters
szFileName is the absolute or relative path to the results file.
cacheDirectory will normally be the value returned from ms_mascotoptions::getCacheDirectory. If it's empty, then the default pattern is used (../data/cache/%Y/%d).
options contains the options stored in mascot.dat. It is used to access the relevant options to generate the file names.
additionalFeatures is normally a string passed to ms-createpip.exe. For example, "-a numUniqPeps -r varmods" would add numUniqPeps and remove varmods from the default. Any other parameters except -a and -r are ignored. An empty string means nothing is added or removed from ms_mascotoptions::getPercolatorFeatures().
adapterParameters is a vector of parameters to ML adapters introduced in Mascot Server 3.0. If the vector is not empty, then the parameters are sorted and hashed into an additional MD5 checksum component of the pip/pop file names.
filenames returns the list of files. The offsets are defined by ms_mascotresfilebase::PERCOLATOR_FILE_NAMES.
exists is a boolean vector which will return flags indicating whether the Percolator files exist. The values correspond to the values in the filenames vector.
Returns
true if all files exist
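
As a rough illustration of how the two output vectors relate, the standalone sketch below walks a filenames/exists pair and collects the entries that are missing. It does not call Mascot Parser: the enum is a local mirror of the documented PERCOLATOR_FILE_NAMES offsets, the helper names are hypothetical, and the file names used with it are made up.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Local mirror of the documented PERCOLATOR_FILE_NAMES offsets.
enum SketchPercolatorFile {
    SKETCH_INPUT_FILE    = 0,
    SKETCH_OUTPUT_TARGET = 1,
    SKETCH_OUTPUT_DECOY  = 2
};

// Return the names whose matching 'exists' flag is false.
std::vector<std::string> missingPercolatorFiles(
        const std::vector<std::string>& filenames,
        const std::vector<bool>& exists) {
    std::vector<std::string> missing;
    const std::size_t n = std::min(filenames.size(), exists.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (!exists[i]) missing.push_back(filenames[i]);
    }
    return missing;
}

// Mirrors the documented return value: true only if every file exists.
bool allPercolatorFilesExist(const std::vector<bool>& exists) {
    return std::all_of(exists.begin(), exists.end(),
                       [](bool flag) { return flag; });
}
```

For example, with hypothetical names {"example.pip", "example.target.pop", "example.decoy.pop"} and flags {true, false, true}, only the entry at offset SKETCH_OUTPUT_TARGET is reported as missing.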

◆ versionGreaterOrEqual()

bool versionGreaterOrEqual ( int  major,
int  minor,
int  revision 
) const

Compare the value returned by getMascotVer() with the passed version number.

Utility function to perform easy comparison. For example, to test if a results file could have taxonomy information, use:

if (versionGreaterOrEqual(2, 4, 0)) { ... }

Note
Versions 2.3.1 to 2.4.0 of Mascot Parser returned an incorrect value. This was fixed in 2.4.1.
Parameters
major is the major version to be compared with
minor is the minor version to be compared with
revision is the minor revision to be compared with
Returns
Will return true if the version of the file is greater than or equal to the passed value
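
The comparison is a lexicographic one over (major, minor, revision). A minimal standalone sketch of that ordering, assuming the file version has already been split into three integers, could look like this. SketchVersion and versionAtLeast are hypothetical names and do not reflect how Mascot Parser stores or parses the value returned by getMascotVer().

```cpp
#include <tuple>

// Hypothetical holder for a parsed version such as 2.4.1.
struct SketchVersion {
    int maj;
    int min;
    int rev;
};

// True if 'file' is greater than or equal to maj.min.rev.
// std::tuple compares element by element, left to right.
bool versionAtLeast(const SketchVersion& file, int maj, int min, int rev) {
    return std::tie(file.maj, file.min, file.rev) >=
           std::make_tuple(maj, min, rev);
}
```

Under this ordering 2.4.1 and 3.0.0 both satisfy a test against 2.4.0, while 2.3.9 does not.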

◆ willCreateCache() [1/2]

bool willCreateCache ( const char *  szFileName,
const ms_mascotoptions opts,
const char *  applicationName,
std::string &  resfileCacheFileName,
unsigned int &  cacheStatus 
)
static

Returns true if a cache file will be created when the ms_mascotresfile_dat constructor is called.

This static function can be called without creating an ms_mascotresfile_msr or ms_mascotresfile_dat object. It can be used in advance of creating the object to see if there will be a delay while (re)creating the cache file(s).

The purpose of this method is to report the status of the cache in addition to whether the cache will be created.

See Multiple return values in Perl, Java, Python and C#

See Static functions in Perl, Java, Python and C#

Parameters
[in] szFileName is the absolute or relative path to the Fxxxxx.dat file
[in] opts is normally loaded from the mascot.dat file using ms_datfile::getMascotOptions()
[in] applicationName is the name of the application or script that is calling this function. The applicationName is searched for in the return value from ms_mascotoptions::getResultsCache and ms_mascotoptions::getResfileCache to determine if the application should be using cache files. If it is not found then the function returns false and sets the cacheStatus to ms_peptidesummary::RESFILE_CACHE_DISABLED_IN_OPTIONS. If null, or an empty string is passed, no check is made.
[out] resfileCacheFileName returns the name of the ms_mascotresfilebase cache file if one exists or would be created
[out] cacheStatus is the ms_peptidesummary::CACHE_STATUS enumeration which gives more details about why the cache file may or may not be created. Multiple values may be bitwise OR'd together.
Returns
true if either of the cache files will be (re-)created because it does not exist, is not complete or is not up to date. It will return false if the options specify that the applicationName shouldn't create or use cache files.

◆ willCreateCache() [2/2]

bool willCreateCache ( const char *  szFileName,
const unsigned int  flags,
const char *  cacheDirectory,
std::string *  cacheFileName 
)
static

Returns true if a cache file will be created when the ms_mascotresfile_dat constructor is called.

This static function can be called without creating an ms_mascotresfile_dat or ms_mascotresfile_msr object, and can be used in advance of creating an object to see if there will be a delay while (re)creating a cache file. The function has the same parameters as the ms_mascotresfilebase constructor.

The behaviour depends on the file type (ms_mascotresfilebase::resfileType()). If it is MSR, this method always returns false, because MSR files do not need a resfile cache. If it is dat28 format (.dat), a resfile cache may be needed.

See Static functions in Perl, Java, Python and C#

See Multiple return values in Perl, Java, Python and C#.

Parameters
[in] szFileName - see ms_mascotresfilebase::ms_mascotresfilebase
[in] flags - see ms_mascotresfilebase::ms_mascotresfilebase
[in] cacheDirectory - see ms_mascotresfilebase::ms_mascotresfilebase
[out] cacheFileName - the full path name of the cache file. For languages other than C++, this will be a reference rather than a pointer to a std::string.
Returns
true if the cache file will be (re-)created because it does not exist or is not complete or not up to date. If the RESFILE_USE_CACHE flag has not been specified, it will always return false
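
The rules stated above (MSR files never need a resfile cache, and nothing is created unless RESFILE_USE_CACHE is set) can be summarised in a small standalone sketch. It is a model of the documented behaviour only: the enums are local mirrors of the documented values, sketchWillCreateCache is a hypothetical name, the cacheIsCurrent argument stands in for the library's own check of whether the cache exists, is complete and is up to date, and returning false for an unknown file type is an assumption.

```cpp
// Local mirrors of the documented FLAGS and RESFILE_TYPE values.
enum SketchFlags : unsigned int {
    SKETCH_NOFLAG    = 0x00000000,
    SKETCH_USE_CACHE = 0x00000001
};
enum SketchResfileType {
    SKETCH_DAT28   = 0,
    SKETCH_MSR     = 1,
    SKETCH_UNKNOWN = 2
};

// Decide whether a resfile cache would be (re-)created.
bool sketchWillCreateCache(SketchResfileType type,
                           unsigned int flags,
                           bool cacheIsCurrent) {
    // MSR never needs a resfile cache; unknown types are assumed not to.
    if (type != SKETCH_DAT28) return false;
    // Without the use-cache flag the answer is always false.
    if ((flags & SKETCH_USE_CACHE) == 0) return false;
    // Otherwise a cache is created only if no current one is available.
    return !cacheIsCurrent;
}
```

Testing the flag with a bitwise AND means that other flags combined with RESFILE_USE_CACHE do not affect the decision in this sketch.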

The documentation for this class was generated from the following files: