Matrix Science Mascot Parser toolkit
 
ms_spectral_lib_entry Class Reference

This class is used to encapsulate a single entry, i.e. a single spectrum from a NIST .msp file or SpectraST .sptxt file.

#include <ms_spectral_lib_entry.hpp>

Inheritance diagram for ms_spectral_lib_entry:
Collaboration diagram for ms_spectral_lib_entry:

Public Types

enum  WHAT_TO_ANNOTATE {
  ANNOTATE_NONE ,
  ANNOTATE_REPLACE_ALL ,
  ANNOTATE_REPLACE_QUESTION_MARKS ,
  ANNOTATE_REPLACE_IF_ALL_EMPTY
}
 Used by annotatePeaks to specify if existing annotation should be overwritten.
 

Public Member Functions

 ms_spectral_lib_entry ()
 Default constructor.
 
 ms_spectral_lib_entry (const ms_spectral_lib_entry &src)
 Copy constructor.
 
 ms_spectral_lib_entry (const std::string &entry, const std::string &fileName, const ms_spectral_lib::FILE_FORMAT format)
 Constructor.
 
 ms_spectral_lib_entry (const std::vector< std::string > &entry, const std::string &fileName, const ms_spectral_lib::FILE_FORMAT format)
 Constructor.
 
virtual ~ms_spectral_lib_entry ()
 Destructor.
 
bool annotatePeaks (const WHAT_TO_ANNOTATE whatToAnnotate, const double fragmentToleranceValue, const std::string fragmentToleranceUnit, const ms_umod_configfile *unimod=0)
 If there is no existing annotation for any of the peaks in the peak list, then try to annotate using sequence string and modifications.
 
void appendErrors (const ms_errors &src)
 Copies all errors from another instance and appends them at the end of own list.
 
void clearAllErrors ()
 Remove all errors from the current list of errors.
 
void clearAnnotation ()
 Removes all annotation from all peaks.
 
void copyFrom (const ms_errors *right)
 Use this member to make a copy of another instance.
 
void copyFrom (const ms_spectral_lib_entry *right)
 Use this member to make a copy of another instance.
 
const std::string get () const
 Returns the whole entry, in the original format.
 
int getCharge () const
 Returns the charge state.
 
std::string getComment () const
 Returns the whole comment line.
 
std::string getCommentField (const char *fieldName) const
 Returns a particular field entry from the comment line.
 
const ms_errs * getErrorHandler () const
 Retrieve the error object using this function to get access to all errors and error parameters.
 
std::string getFileName () const
 Returns the filename where the entry was loaded from.
 
int getLastError () const
 Return the error number of the last error that occurred.
 
std::string getLastErrorString () const
 Return the error description of the last error that occurred.
 
std::string getLine (const char *key) const
 Returns the line for a given key.
 
int getMods (std::vector< std::string > &names, std::vector< int > &positions) const
 Returns the modifications.
 
double getMW () const
 Returns the molecular weight.
 
std::string getName () const
 Returns the Name which is normally the sequence followed by /[CHARGE].
 
int getNumPeaks () const
 Returns the number of peaks in the spectrum.
 
std::vector< std::string > getPeakList (bool convertToNISTformat=false) const
 Returns the peak list.
 
std::string getPeakListChecksum () const
 Returns an MD5 checksum for the peak list.
 
matrix_science::ms_spectral_lib_peak_list getPeakListObject () const
 Returns the peak list object.
 
double getPrecursorMZ () const
 Return the 'best available' precursor m/z value.
 
std::string getSequence () const
 Returns the sequence.
 
bool isValid () const
 Call this function to determine if there have been any errors.
 
ms_spectral_lib_entry & operator= (const ms_spectral_lib_entry &right)
 C++ style assignment operator.
 

Detailed Description

This class is used to encapsulate a single entry, i.e. a single spectrum from a NIST .msp file or SpectraST .sptxt file.

Support for spectral library searches was added in Mascot Server 2.6 and Mascot Parser 2.6. Although external NIST software is used for the spectral library search itself, the msp files, which are plain text, still need to be processed.

The ms_spectral_lib_file class is used to open an msp file, and an individual entry can be returned using ms_spectral_lib_file::getEntryFromNumber.

An msp file consists of multiple lines, each of the form [param][colon][space][value]. For example,

Name: AAAAAAAAAAAAAAAGAGAGAK/3
MW: 1598.861
Comment: Spec=Consensus Pep=Tryp
Num peaks: 150

followed by, in this case, 150 lines of peak data, followed by a blank line.
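The [param][colon][space][value] layout can be illustrated with a small standalone sketch. It does not use Mascot Parser and is not how the library itself parses entries; the function name splitParamLine is hypothetical and the split at the first ": " is an assumption based on the description above.

```cpp
#include <string>

// Splits one header line of the form "param: value" at the first ": ".
// Returns false when the separator is missing (for example, a peak line),
// in which case the output arguments are left unchanged.
bool splitParamLine(const std::string &line, std::string &param, std::string &value)
{
    const std::string::size_type pos = line.find(": ");
    if (pos == std::string::npos || pos == 0) {
        return false;
    }
    param = line.substr(0, pos);
    value = line.substr(pos + 2);
    return true;
}
```

Applied to the line "Num peaks: 150", this yields the parameter "Num peaks" and the value "150".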

The format is defined here.

The SpectraST files (.sptxt) have minor differences, for example additional lines and a different format for the peak list. See ms_spectral_lib_file::saveAs() for details.

The code in this class allows additional parameter value lines in case of future extension. However, the Name:, Comment: and Num peaks: parameters above are required, or an error will be generated.

See Spectral libraries for related information.

Member Enumeration Documentation

◆ WHAT_TO_ANNOTATE

Used by annotatePeaks to specify if existing annotation should be overwritten.

See Using enumerated values and static const ints in Perl, Java, Python and C#.

See also ms_spectral_lib_file::saveAs

Enumerator
ANNOTATE_NONE 

Don't perform any annotation.

ANNOTATE_REPLACE_ALL 

Replace all existing annotation.

ANNOTATE_REPLACE_QUESTION_MARKS 

Only annotate if ms_spectral_lib_peak_list::getPeakAnnotationLevel returns ms_spectral_lib_peak_list::ANNOT_LVL_QUESTION_MARKS.

ANNOTATE_REPLACE_IF_ALL_EMPTY 

Only annotate if ms_spectral_lib_peak_list::getPeakAnnotationLevel returns ms_spectral_lib_peak_list::ANNOT_LVL_NONE.

Constructor & Destructor Documentation

◆ ms_spectral_lib_entry() [1/4]

Default constructor.

An ms_spectral_lib_entry object is normally created using ms_spectral_lib_file::getEntryFromNumber.

◆ ms_spectral_lib_entry() [2/4]

Copy constructor.

Parameters
src is the source to initialise from

◆ ms_spectral_lib_entry() [3/4]

ms_spectral_lib_entry ( const std::string &  entry,
const std::string &  fileName,
const ms_spectral_lib::FILE_FORMAT  format 
)

Constructor.

An ms_spectral_lib_entry object is normally created using ms_spectral_lib_file::getEntryFromNumber.

Parameters
entry is a string containing multiple lines from an msp file.
fileName is the path of the file where the entry was read from. It is saved as a convenience, and also for error reporting.
format is the format of the entry. Should be an enum, of the type ms_spectral_lib_file::FILE_FORMAT

◆ ms_spectral_lib_entry() [4/4]

ms_spectral_lib_entry ( const std::vector< std::string > &  entry,
const std::string &  fileName,
const ms_spectral_lib::FILE_FORMAT  format 
)

Constructor.

An ms_spectral_lib_entry object is normally created using ms_spectral_lib_file::getEntryFromNumber.

Parameters
entry is a vector of strings containing the lines from an msp file.
fileName is the path of the file where the entry was read from. It is saved as a convenience, and also for error reporting.
format is the format of the entry. Should be an enum, of the type ms_spectral_lib_file::FILE_FORMAT

Member Function Documentation

◆ annotatePeaks()

bool annotatePeaks ( const WHAT_TO_ANNOTATE  whatToAnnotate,
const double  fragmentToleranceValue,
const std::string  fragmentToleranceUnit,
const ms_umod_configfile *  unimod = 0 
)

If there is no existing annotation for any of the peaks in the peak list, then try to annotate using sequence string and modifications.

Annotate peaks for this one entry.

For Spectral libraries on Mascot Server, the user needs to specify a tolerance in ppm or Daltons. See ms_libraryoptions::getToleranceInDa and ms_libraryoptions::getToleranceInPPM.

Parameters
whatToAnnotate is used to specify whether existing annotation should be replaced.
fragmentToleranceValue is the value in the units specified, for matching to the calculated data. Only peaks within this tolerance will be annotated. Other peaks will be annotated with a "?".
fragmentToleranceUnit must be "Da", "mmu" or "ppm".
unimod is required if the entry has any modifications that are just specified by name. Otherwise, there is no way to calculate the fragment ion masses.
Returns
true if annotation was performed.
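As an illustration of what the three tolerance units mean, the following standalone sketch converts a tolerance to Daltons and tests a peak against a calculated m/z. It is not the library's matching code. The function names are hypothetical, and taking ppm relative to the calculated m/z is an assumption; 1 mmu is 0.001 Da.

```cpp
#include <cmath>
#include <string>

// Converts a tolerance to Daltons at the given calculated m/z.
// Returns a negative value for an unrecognised unit.
// Assumption: ppm is taken relative to the calculated m/z.
double toleranceInDa(double value, const std::string &unit, double calculatedMz)
{
    if (unit == "Da")  return value;
    if (unit == "mmu") return value / 1000.0;
    if (unit == "ppm") return calculatedMz * value / 1.0e6;
    return -1.0;
}

// True when the observed m/z lies within the tolerance of the calculated m/z.
bool withinTolerance(double observedMz, double calculatedMz,
                     double value, const std::string &unit)
{
    const double tol = toleranceInDa(value, unit, calculatedMz);
    return tol >= 0.0 && std::fabs(observedMz - calculatedMz) <= tol;
}
```

Under these assumptions, 10 ppm at m/z 500 corresponds to 0.005 Da, so a peak observed at 500.004 would match and one at 500.006 would not.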

◆ appendErrors()

void appendErrors ( const ms_errors &  src)
inherited

Copies all errors from another instance and appends them at the end of own list.

Parameters
src The object to copy the errors across from. See Maintaining object references: two rules of thumb.

◆ clearAllErrors()

void clearAllErrors ( )
inherited

Remove all errors from the current list of errors.

The list of 'errors' can include fatal errors, warning messages, information messages and different levels of debugging messages.

All messages are accumulated into a list in this object, until clearAllErrors() is called.

See Error Handling.

See also
isValid(), getLastError(), getLastErrorString(), getErrorHandler()
Examples
common_error.cpp, resfile_error.cpp, and resfile_summary.cpp.

◆ copyFrom() [1/2]

void copyFrom ( const ms_errors *  right)
inherited

Use this member to make a copy of another instance.

Parameters
right is the source to initialise from

◆ copyFrom() [2/2]

void copyFrom ( const ms_spectral_lib_entry *  right)

Use this member to make a copy of another instance.

Parameters
right is the source to initialise from

◆ getCharge()

int getCharge ( ) const

Returns the charge state.

Each spectrum begins with the line in the format:

Name: [peptide sequence]/[charge]

for example

Name: AAAAAAGAGPEM(O)VR/2

where each amino acid in the [peptide sequence] is represented by the usual (upper case) letter sequence and [charge] is the positive charge on the peptide. The only modification explicitly shown is oxidized methionine as M(O).

The charge is taken from the end of this line.

Returns
the charge
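A minimal standalone sketch of reading the charge from the end of a Name value is shown here. It does not call Mascot Parser; the function name chargeFromName is hypothetical, and returning 0 when no charge is present is an assumption, not documented behaviour of getCharge().

```cpp
#include <cstdlib>
#include <string>

// Returns the integer after the final '/' in a Name value.
// Assumption: 0 is returned when there is no '/' or nothing follows it.
int chargeFromName(const std::string &name)
{
    const std::string::size_type slash = name.rfind('/');
    if (slash == std::string::npos || slash + 1 >= name.size()) {
        return 0;
    }
    return std::atoi(name.c_str() + slash + 1);
}
```

For the example above, "AAAAAAGAGPEM(O)VR/2" gives a charge of 2.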

◆ getComment()

std::string getComment ( ) const

Returns the whole comment line.

The third line for each entry contains annotation information for the peptide, its origin and its spectrum. It has the format:

Comments: field1=value [field2=value ]...

Comments are composed of a series of space delimited field=value pairs, where values may be embedded within double quotes. All field names are described in Table 3 of this document. There is one mandatory field, namely Parent=<m/z>, which is the precursor ion m/z required for searching.

Returns
the whole comment line

◆ getCommentField()

std::string getCommentField ( const char *  fieldName) const

Returns a particular field entry from the comment line.

The third line for each entry contains annotation information for the peptide, its origin and its spectrum. It has the format:

Comments: field1=value [field2=value ]...

Comments are composed of a series of space delimited field=value pairs, where values may be embedded within double quotes. All field names are described in Table 3 of this document. There is one mandatory field, namely Parent=<m/z>, which is the precursor ion m/z required for searching.

Parameters
fieldName should not contain the =. For example, specify "Parent". The fieldName is case sensitive.
Returns
The value for the key. If the value is in quotes, then these are stripped. An empty string is returned if the parameter is not found.
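The lookup rules described above (space delimited pairs, optional double quotes, case sensitive names, empty string when missing) can be sketched as a standalone function. This is an illustration only, not the Mascot Parser implementation; the name commentField is hypothetical, and treating an unterminated quote as running to the end of the line is an assumption.

```cpp
#include <string>

// Looks up fieldName in a comment made of space delimited field=value pairs.
// Quoted values may contain spaces; the quotes are stripped from the result.
// Returns an empty string if the field is not present.
std::string commentField(const std::string &comment, const std::string &fieldName)
{
    const std::string::size_type n = comment.size();
    std::string::size_type i = 0;
    while (i < n) {
        while (i < n && comment[i] == ' ') ++i;              // skip separators
        const std::string::size_type keyStart = i;
        while (i < n && comment[i] != '=' && comment[i] != ' ') ++i;
        const std::string key = comment.substr(keyStart, i - keyStart);
        std::string value;
        if (i < n && comment[i] == '=') {
            ++i;                                             // step over '='
            if (i < n && comment[i] == '"') {
                // Quoted value: runs to the closing quote (or end of line).
                const std::string::size_type close = comment.find('"', i + 1);
                const std::string::size_type end =
                    (close == std::string::npos) ? n : close;
                value = comment.substr(i + 1, end - (i + 1));
                i = (close == std::string::npos) ? n : close + 1;
            } else {
                // Unquoted value: runs to the next space.
                const std::string::size_type valStart = i;
                while (i < n && comment[i] != ' ') ++i;
                value = comment.substr(valStart, i - valStart);
            }
        }
        if (key == fieldName) return value;                  // case sensitive
    }
    return std::string();
}
```

Because the comparison is exact, asking for "parent" in a comment that only contains Parent=... returns an empty string.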

◆ getErrorHandler()

const ms_errs * getErrorHandler ( ) const
inherited

Retrieve the error object using this function to get access to all errors and error parameters.

See Error Handling.

Returns
Constant pointer to the error handler
See also
isValid(), getLastError(), getLastErrorString(), clearAllErrors(), getErrorHandler()
Examples
common_error.cpp, and http_helper_getstring.cpp.

◆ getFileName()

std::string getFileName ( ) const

Returns the filename where the entry was loaded from.

Each entry keeps a copy of the filename, which can be useful for error reporting.

Returns
the library filename

◆ getLastError()

int getLastError ( ) const
inherited

Return the error number of the last error that occurred.

All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.

See Error Handling.

See also
isValid(), getLastErrorString(), clearAllErrors(), getErrorHandler()
Returns
the error number of the last error, or 0 if there have been no errors.

◆ getLastErrorString()

std::string getLastErrorString ( ) const
inherited

Return the error description of the last error that occurred.

All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.

Returns
Most recent error, warning, information or debug message

See Error Handling.

See also
isValid(), getLastError(), clearAllErrors(), getErrorHandler()
Examples
common_error.cpp, config_enzymes.cpp, config_fragrules.cpp, config_license.cpp, config_mascotdat.cpp, config_masses.cpp, config_modfile.cpp, config_procs.cpp, config_quantitation.cpp, config_taxonomy.cpp, http_helper_getstring.cpp, and tools_aahelper.cpp.

◆ getMods()

int getMods ( std::vector< std::string > &  names,
std::vector< int > &  positions 
) const

Returns the modifications.

Peptide modifications are specified in the comment line (see getComment()) and are given in the following format:

Mods=#/n,aa,tag/n,aa,tag...

Where

  • # is the number of modifications, with modifications separated by a forward slash "/" . Hence, Mods=0 denotes no modifications. Modifications are arranged in order of amino acid position and, if multiple modifications occur at a single position, they are arranged alphabetically by tag.
  • n is the position of the substituted amino acid, starting from 0
  • aa is the modified amino acid symbol
  • tag is the name of the modification. Ideally, this is the name defined in Unimod, but some library files may use a different convention.

For example:

Mods=2/0,A,Acetyl/8,M,Oxidation

Files from NIST after 2017 use a different format:

Mods=1(7,C,CAM)
Mods=2(19,C,CAM)(20,C,CAM)

Or files generated by certain versions of Mascot Server:

Mods="2/0,G,Succinyl/11,K,IMID_2H(4)_NL (C-term K)"
Mods="3/6,N,2-succinyl (C)/10,N,2-succinyl (C)/12,Q,2-succinyl (C)"

See Using STL vector classes vectori, vectord and VectorString in Perl, Java, Python and C#.

Parameters
[out] names are the names of the modifications.
[out] positions are the zero-based positions of the modifications.
Returns
the number of modifications.
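The two Mods layouts shown above can be parsed with a short standalone sketch. It is not the code used by getMods(); the names addItem and parseMods are hypothetical, the input is assumed to be the field value with "Mods=" and any surrounding quotes already removed, and tags in the parenthesised layout are assumed not to contain ")".

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Parses one "n,aa,tag" item and appends it to the outputs if well formed.
static void addItem(const std::string &item,
                    std::vector<std::string> &names, std::vector<int> &positions)
{
    const std::string::size_type c1 = item.find(',');
    const std::string::size_type c2 =
        (c1 == std::string::npos) ? c1 : item.find(',', c1 + 1);
    if (c2 == std::string::npos) return;
    positions.push_back(std::atoi(item.substr(0, c1).c_str()));
    names.push_back(item.substr(c2 + 1));
}

// Fills names and positions from a Mods value; returns the number parsed.
int parseMods(const std::string &mods,
              std::vector<std::string> &names, std::vector<int> &positions)
{
    names.clear();
    positions.clear();
    std::string::size_type i = 0;
    while (i < mods.size() && mods[i] >= '0' && mods[i] <= '9') ++i;  // skip count
    if (i < mods.size() && mods[i] == '(') {
        // Parenthesised layout: 2(19,C,CAM)(20,C,CAM)
        while (i < mods.size() && mods[i] == '(') {
            const std::string::size_type close = mods.find(')', i);
            if (close == std::string::npos) break;
            addItem(mods.substr(i + 1, close - i - 1), names, positions);
            i = close + 1;
        }
    } else {
        // Slash layout: 2/0,A,Acetyl/8,M,Oxidation
        while (i < mods.size() && mods[i] == '/') {
            std::string::size_type next = mods.find('/', i + 1);
            if (next == std::string::npos) next = mods.size();
            addItem(mods.substr(i + 1, next - i - 1), names, positions);
            i = next;
        }
    }
    return static_cast<int>(names.size());
}
```

The sketch counts the items it actually finds rather than trusting the leading number, which is a design choice made here for robustness and not something the source specifies.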

◆ getMW()

double getMW ( ) const

Returns the molecular weight.

Taken from

MW: [exact molar mass of the peptide ion]

This is used for compatibility with electron-ionization spectra and is not used in the peptide library search; this line may be omitted.

Returns
the molecular weight

◆ getName()

std::string getName ( ) const

Returns the Name which is normally the sequence followed by /[CHARGE].

Each spectrum begins with the line in the format:

Name: [peptide sequence]/[charge]

for example

Name: AAAAAAGAGPEM(O)VR/2

where each amino acid in the [peptide sequence] is represented by the usual (upper case) letter sequence and [charge] is the positive charge on the peptide. The only modification explicitly shown is oxidized methionine as M(O).

For SpectraST files, a modification delta can be shown, for example:

Name: AAADALS[167]DLEIKDSK/2
Name: K.AADQADESSPLLS[167]PS[167]NSNHPSEHPQQDLNNK.S/4
Returns
name of the spectrum

◆ getNumPeaks()

int getNumPeaks ( ) const

Returns the number of peaks in the spectrum.

To get the peak list itself, call getPeakList().

Returns
the number of peaks in the spectrum

◆ getPeakList()

std::vector< std::string > getPeakList ( bool  convertToNISTformat = false) const

Returns the peak list.

The format of the peak list is described in detail in this document.

Each peak is represented as a line divided into three tab separated fields:

[m/z][tab][relative abundance][tab]"[peak annotation(s)]"

For example:

196.1 41 "b3-18/-0.01 60/90 0.2"

For a SpectraST file, the format is slightly different:

311.0524 341.7 b4-18/-0.08 83/60 0.0981|0.54
325.1383 302.1 ? 67/60 0.0788|0.56

See also getPeakListObject()

MSPepsearch.exe does not return any results if there is no annotation on any peaks. If the input file had no annotation, and convertToNISTformat is true, then each peak is annotated with a "?".

For finer control over the format of the peak list, call getPeakListObject and then ms_spectral_lib_peak_list::asText.

Parameters
convertToNISTformat determines the format. If this flag is false, then the string returned will be the same format as the text in the .msp or .sptxt file. If 'true', then the format is converted to the NIST format as described above.
Returns
the peak list in text format. See Using STL vector classes vectori, vectord and VectorString in Perl, Java, Python and C#

◆ getPeakListChecksum()

std::string getPeakListChecksum ( ) const

Returns an MD5 checksum for the peak list.

All spaces, tabs, newlines, carriage returns and quotation marks are removed before calculating the checksum.

This is the checksum of the original 'file' entry before any annotations have been added, or format changes made.

For example: aj5r7ovt6b6q4wr7btazzd4s6i

Returns
the checksum as a string
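The normalisation step that precedes the checksum can be sketched as follows. Only the character stripping is shown; the MD5 calculation and the encoding of the result are left out. The function name stripForChecksum is hypothetical, and reading "quotation marks" as double quotes only is an assumption.

```cpp
#include <string>

// Removes spaces, tabs, newlines, carriage returns and double quotes,
// leaving the text that would then be passed to the checksum.
std::string stripForChecksum(const std::string &peakListText)
{
    std::string out;
    out.reserve(peakListText.size());
    for (char c : peakListText) {
        if (c != ' ' && c != '\t' && c != '\n' && c != '\r' && c != '"') {
            out.push_back(c);
        }
    }
    return out;
}
```

Stripping these characters means that two peak lists differing only in whitespace, line endings or quoting normalise to the same text.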

◆ getPeakListObject()

ms_spectral_lib_peak_list getPeakListObject ( ) const

Returns the peak list object.

Returns
the peak list as a ms_spectral_lib_peak_list

◆ getPrecursorMZ()

double getPrecursorMZ ( ) const

Return the 'best available' precursor m/z value.

Returns
the "Mz_exact" value in the comment field if that exists, otherwise the "Parent" value in the comment field. If neither is present, returns 0.0.

◆ getSequence()

std::string getSequence ( ) const

Returns the sequence.

Each spectrum begins with the line in the format:

Name: [peptide sequence]/[charge]

for example

Name: AAAAAAGAGPEM(O)VR/2

where each amino acid in the [peptide sequence] is represented by the usual (upper case) letter sequence and [charge] is the positive charge on the peptide. The only modification explicitly shown is oxidized methionine as M(O).

For SpectraST files, a modification delta can be shown, for example:

Name: AAADALS[167]DLEIKDSK/2
Name: K.AADQADESSPLLS[167]PS[167]NSNHPSEHPQQDLNNK.S/4

The charge is stripped from the end of this line, and all (O) and other mods are removed.

Returns
the sequence
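The stripping described above can be sketched as a standalone function that drops the charge and any parenthesised or bracketed modification markers. This is an illustration and not the getSequence() implementation; sequenceFromName is a hypothetical name, and flanking residues such as the leading "K." in some SpectraST names are deliberately left untouched because the source does not say how they are handled.

```cpp
#include <string>

// Removes the trailing "/charge" and any "(...)" or "[...]" groups.
std::string sequenceFromName(const std::string &name)
{
    // rfind returns npos when there is no '/', so substr keeps the whole name.
    const std::string body = name.substr(0, name.rfind('/'));
    std::string seq;
    int depth = 0;  // greater than zero while inside a modification marker
    for (char c : body) {
        if (c == '(' || c == '[') {
            ++depth;
        } else if (c == ')' || c == ']') {
            if (depth > 0) --depth;
        } else if (depth == 0) {
            seq.push_back(c);
        }
    }
    return seq;
}
```

With this sketch, "AAAAAAGAGPEM(O)VR/2" becomes "AAAAAAGAGPEMVR" and "AAADALS[167]DLEIKDSK/2" becomes "AAADALSDLEIKDSK".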

◆ isValid()

bool isValid ( ) const
inherited

Call this function to determine if there have been any errors.

◆ operator=()

ms_spectral_lib_entry & operator= ( const ms_spectral_lib_entry &  right)

C++ style assignment operator.

Parameters
right is the source to initialise from
Returns
reference to the current object
