Matrix Science Mascot Parser toolkit
 
ms_ms1quant_time_align Class Reference

Time alignment between multiple runs (raw files) is performed in Mascot Distiller for label free quantitation. More...

#include <ms_ms1quant_time_align.hpp>


Public Types

typedef std::vector< std::vector< std::map< double, double > > > featureWidths_t
 [prj][m/z bin][rt->width]
 
typedef std::map< int, std::vector< int > > fractionToSubProject_t
 the vector for each fraction contains 0 based subProject numbers and fraction numbers don't need to be sequential, hence the map
 
enum  status_t {
  ST_TA_NO_DATA ,
  ST_TA_LOADED_FROM_XML ,
  ST_TA_LOADED_FROM_CDB ,
  ST_TA_CALCULATED
}
 

Public Member Functions

 ms_ms1quant_time_align (const int binSize=12)
 Default constructor.
 
 ms_ms1quant_time_align (const ms_ms1quant_time_align_body &body)
 Populated constructor.
 
void appendErrors (const ms_errors &src)
 Copies all errors from another instance and appends them at the end of own list.
 
bool calculateFromConsensuValues (bool replaceExisting=false)
 A retention time shift is calculated by the shift from project A to the consensus and then subtracting the shift from the consensus to project B.
 
void clearAllErrors ()
 Remove all errors from the current list of errors.
 
void copyFrom (const ms_errors *right)
 Use this member to make a copy of another instance.
 
std::string getAlgorithmName () const
 Return the name of the algorithm used for calculating the time alignment.
 
int getBinSize () const
 Return the value supplied in the constructor, or loaded from an xml or cdb file.
 
ms_ms1quant_time_align_limits & getCombinedLimits (const int fractionNum)
 The m/z and retention time limits for each fraction are calculated from the limits for each rawfile and search results file for that fraction.
 
ms_ms1quant_time_align_limits getCombinedLimits (const int fractionNum) const
 The m/z and retention time limits for each fraction are calculated from the limits for each rawfile and search results file for that fraction.
 
const ms_errs * getErrorHandler () const
 Retrieve the error object using this function to get access to all errors and error parameters.
 
double getEstimatedFeatureWidth (const int subProjectId, const double mOverZ, const double rt) const
 For label free, return the estimated width of an XIC for a given mOverZ and retention time.
 
bool getEvaluation (const int subProject, double &meanErrorsRaw, double &meanErrorsAligned, double &stdevErrorsRaw, double &stdevErrorsAligned, double &pearsonCoefficientRaw, double &pearsonCoefficientAligned)
 
const featureWidths_t & getFeatureWidths () const
 Returns the multidimensional vector that has the feature widths for each project.
 
std::vector< std::vector< std::vector< double > > > & getFinalResults ()
 
const std::vector< std::vector< std::vector< double > > > & getFinalResults () const
 
fractionToSubProject_t getFractionToSubProjectMap () const
 Return the map to enable looking up a list of sub projects for each fraction.
 
int getLastError () const
 Return the error number of the last error that occurred.
 
std::string getLastErrorString () const
 Return the error description of the last error that occurred.
 
double getRToffset (const int myProjectId, const int otherProjectId, const double rtInOtherProject, const double mOverZ) const
 Get the offset between two projects.
 
const std::vector< std::vector< std::vector< double > > > & getShiftsFromConsensus () const
 Returns the vector that has all the offsets from a consensus to a specified project.
 
const std::vector< std::vector< std::vector< double > > > & getShiftsToConsensus () const
 Returns the vector that has all the offsets from one project to a consensus.
 
status_t getStatus (void) const
 Get the status of the data.
 
std::string getStatusAsText (void) const
 Get the status of the data as text for a message.
 
const std::vector< int > getSubProjectToFractionMap () const
 Return a vector containing the fraction number that each subproject belongs to.
 
bool isValid () const
 Call this function to determine if there have been any errors.
 
bool loadXmlFile (const std::string &xmlFilename, const std::string &schemaDirectory)
 Populate the object from an XML file.
 
bool saveXmlFile (const std::string &xmlFilename, const std::string &schemaDirectory) const
 Save just the time alignment data to an XML file.
 
void setAlgorithmName (const char *algorithmName)
 
void setAverageFeatureWidths (const std::vector< double > &featureWidths)
 For supervised time alignment, a single feature width per subproject is used.
 

Protected Member Functions

void setEvaluation (int subProject, double meanErrorsRaw, double meanErrorsAligned, double stdevErrorsRaw, double stdevErrorsAligned, double pearsonCoefficientRaw, double pearsonCoefficientAligned)
 

Detailed Description

Time alignment between multiple runs (raw files) is performed in Mascot Distiller for label free quantitation.

The time alignment results are stored within a .rov file in Distiller 2.9 and later.

See: ms_ms1quantitation::getTimeAlignmentData() for how to obtain this object. Alternatively, create an empty object and then call ms_ms1quant_time_align::loadXmlFile().

For example code, see Examples for the Mascot Parser quantitation module.

Member Enumeration Documentation

◆ status_t

enum status_t

Flags used to determine whether time alignment needs to be performed (again)

Enumerator
ST_TA_NO_DATA 

No time alignment data. If the protocol is replicate, this indicates time alignment still needs to be performed. Time alignment is not performed for other protocols.

ST_TA_LOADED_FROM_XML 

The time alignment data has been loaded from an xml so does not need to be recalculated.

ST_TA_LOADED_FROM_CDB 

The time alignment data has been loaded from a cdb file so does not need to be recalculated.

ST_TA_CALCULATED 

The time alignment data has been calculated and is in memory.
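
The rule described by these flags can be sketched as a small predicate. This is only an illustration: the `Status` enum below is a local mirror of `status_t` so that the snippet compiles without the Parser headers, and `needsTimeAlignment` and its `isReplicateProtocol` argument are hypothetical names, not part of Mascot Parser.

```cpp
// Local mirror of ms_ms1quant_time_align::status_t (an assumption made so the
// sketch is self-contained; real code would use the enum from the library).
enum Status {
    ST_TA_NO_DATA,
    ST_TA_LOADED_FROM_XML,
    ST_TA_LOADED_FROM_CDB,
    ST_TA_CALCULATED
};

// Hypothetical helper: time alignment is still required only when the
// protocol is replicate and no alignment data is present yet.
bool needsTimeAlignment(Status status, bool isReplicateProtocol)
{
    return isReplicateProtocol && status == ST_TA_NO_DATA;
}
```

In client code the first argument would presumably come from getStatus().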

Constructor & Destructor Documentation

◆ ms_ms1quant_time_align()

ms_ms1quant_time_align ( const int  binSize = 12)

Default constructor.

The constructor will normally be called internally in Parser. See: ms_ms1quantitation::getTimeAlignmentData() for how to obtain this object. Alternatively, use this constructor and then call ms_ms1quant_time_align::loadXmlFile()

Parameters
binSize defaults to 12 daltons

Member Function Documentation

◆ appendErrors()

void appendErrors ( const ms_errors &  src)
inherited

Copies all errors from another instance and appends them at the end of own list.

Parameters
src The object to copy the errors across from. See Maintaining object references: two rules of thumb.

◆ clearAllErrors()

void clearAllErrors ( )
inherited

Remove all errors from the current list of errors.

The list of 'errors' can include fatal errors, warning messages, information messages and different levels of debugging messages.

All messages are accumulated into a list in this object, until clearAllErrors() is called.

See Error Handling.

See also
isValid(), getLastError(), getLastErrorString(), getErrorHandler()
Examples
common_error.cpp, resfile_error.cpp, and resfile_summary.cpp.

◆ copyFrom()

void copyFrom ( const ms_errors *  right)
inherited

Use this member to make a copy of another instance.

Parameters
right is the source to initialise from

◆ getAlgorithmName()

std::string getAlgorithmName ( ) const

Return the name of the algorithm used for calculating the time alignment.

Brief descriptive text that describes the algorithm used for the time alignment.

Currently restricted to "Unsupervised" or "Supervised"

Returns
the name of the type of algorithm used.

◆ getBinSize()

int getBinSize ( ) const

Return the value supplied in the constructor, or loaded from an xml or cdb file.

For time alignment, m/z values are binned together. This is the size of each bin. Spectra with precursors in the same 'bin' will have the same time alignment shift.

Returns
the bin size
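
As a rough illustration of what binning by m/z means, the sketch below maps a precursor m/z to a zero-based bin index using a fixed bin width. The helper name `mzToBinIndex`, the `minMz` origin and the clamping rule are all assumptions made for this example; this page does not state how Parser computes bin indexes internally.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of Mascot Parser: map an m/z value to a
// zero-based bin index for bins of width binSize that start at minMz.
// Values outside the binned range are clamped to the nearest bin.
std::size_t mzToBinIndex(double mOverZ, double minMz, int binSize, std::size_t numBins)
{
    if (numBins == 0 || binSize <= 0) {
        return 0;
    }
    const double offset = std::floor((mOverZ - minMz) / binSize);
    if (offset <= 0.0) {
        return 0;  // below the first bin: use the lowest bin
    }
    // Above the last bin: use the highest bin
    return std::min(static_cast<std::size_t>(offset), numBins - 1);
}
```

With a bin size of 12, two precursors at m/z 400.0 and 411.9 would fall into the same bin under this scheme and would therefore share a time alignment shift.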

◆ getCombinedLimits() [1/2]

ms_ms1quant_time_align_limits & getCombinedLimits ( const int  fractionNum)

The m/z and retention time limits for each fraction are calculated from the limits for each rawfile and search results file for that fraction.

Parameters
fractionNum can be obtained by calling getSubProjectToFractionMap()
Returns
the limits for the specified fraction

◆ getCombinedLimits() [2/2]

ms_ms1quant_time_align_limits getCombinedLimits ( const int  fractionNum) const

The m/z and retention time limits for each fraction are calculated from the limits for each rawfile and search results file for that fraction.

Parameters
fractionNum can be obtained by calling getSubProjectToFractionMap()
Returns
the limits for the specified fraction

◆ getErrorHandler()

const ms_errs * getErrorHandler ( ) const
inherited

Retrieve the error object using this function to get access to all errors and error parameters.

See Error Handling.

Returns
Constant pointer to the error handler
See also
isValid(), getLastError(), getLastErrorString(), clearAllErrors(), getErrorHandler()
Examples
common_error.cpp, and http_helper_getstring.cpp.

◆ getEstimatedFeatureWidth()

double getEstimatedFeatureWidth ( const int  subProjectId,
const double  mOverZ,
const double  rt 
) const

For label free, return the estimated width of an XIC for a given mOverZ and retention time.

Used in label free quantitation (Replicate, not Average). Should only be used where a retention time is predicted based on time alignment. If a match has been found from a Mascot database search, we should be confident that we can start searching for the XIC from that retention time. If the retention time has been predicted by the time alignment algorithm, then it isn't safe to assume that the predicted point will be in the XIC. It is safer to use a range, but we need an estimate of that range, and that is what is returned by this function.

The retention time of the query in another subproject is transformed into the retention time passed to this function by calling getRToffset() , so rt is not going to be an exact value, and it's unlikely that there will be a 'feature' in this subproject at that retention time. So, this function looks for the closest retention time.

It can't be assumed that all subprojects will have the same number of bins. For example, another subproject may have a peptide with m/z 5000 but this subproject may have only found (in the crude feature finding) values up to a maximum m/z 4800. In this case, the highest bin is used.

If there are no bins for a specific subproject, or an invalid subProjectId is passed, then zero will be returned.

For Waters MS^E DIA projects, it's possible to get LOGMSG_QUANT_TIME_ALIGN_INDEX_MZ_RANGE1 debug messages for some lower mass peptides. This is because an "average" charge will have been rounded up to the nearest integer and matrix_science::ms_ms1quant_match_component_body::updateMoverZ() is called, so the value returned by matrix_science::ms_ms1quant_match_component::getMoverz() may be lower than any of the precursor values in the 'survey' scans. If this happens, the debug message is generated and the nearest bin is used. This should give a very close approximation to the proper value, and the debug message can safely be ignored by the client application.

Logging, warnings and errors:
ms_log::LOGMSG_QUANT_TIME_ALIGN_INDEX_MZ_RANGE1
ms_errs::ERR_MSP_MS1QUANT_NO_EST_WIDTH
ms_errs::ERR_MSP_MS1QUANT_INVALID_PROJECT_ID
Parameters
[in] subProjectId is a 1 based sub project number
[in] mOverZ is the precursor m/z value. m/z values are 'binned' so similar m/z values may return the same width
[in] rt is the predicted retention time of the ms/ms spectrum
Returns
the estimated feature width in seconds
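
The lookup rules described above (closest retention time, highest bin when the m/z is beyond the last bin, zero when nothing is available) can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `FeatureWidths` simply mirrors the shape of `featureWidths_t`, `nearestFeatureWidth` is a hypothetical helper, and the project and bin arguments are zero-based indexes chosen for this sketch.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Same shape as featureWidths_t: [prj][m/z bin][rt -> width]
typedef std::vector<std::vector<std::map<double, double> > > FeatureWidths;

// Hypothetical helper: return the width stored at the retention time that is
// closest to rt. prj and bin are zero-based indexes in this sketch.
double nearestFeatureWidth(const FeatureWidths& widths,
                           std::size_t prj, std::size_t bin, double rt)
{
    if (prj >= widths.size() || widths[prj].empty()) {
        return 0.0;  // invalid project index, or no bins for this project
    }
    if (bin >= widths[prj].size()) {
        bin = widths[prj].size() - 1;  // m/z beyond the last bin: use the highest bin
    }
    const std::map<double, double>& rtToWidth = widths[prj][bin];
    if (rtToWidth.empty()) {
        return 0.0;
    }
    // First entry with a retention time >= rt
    std::map<double, double>::const_iterator hi = rtToWidth.lower_bound(rt);
    if (hi == rtToWidth.begin()) {
        return hi->second;             // rt is before the first entry
    }
    if (hi == rtToWidth.end()) {
        return std::prev(hi)->second;  // rt is after the last entry
    }
    std::map<double, double>::const_iterator lo = std::prev(hi);
    // Pick whichever neighbour is closer to rt (ties go to the earlier one)
    return (rt - lo->first <= hi->first - rt) ? lo->second : hi->second;
}
```

Using `std::map::lower_bound` keeps the search logarithmic in the number of stored retention times, which is why a sorted map suits this kind of nearest-neighbour lookup.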

◆ getEvaluation()

bool getEvaluation ( const int  subProject,
double &  meanErrorsRaw,
double &  meanErrorsAligned,
double &  stdevErrorsRaw,
double &  stdevErrorsAligned,
double &  pearsonCoefficientRaw,
double &  pearsonCoefficientAligned 
)

See Multiple return values in Perl, Java, Python and C#.

Use these values to assess how good the alignment is before and after the optimised shift has been found using a measure of correlation between the two datasets. The Pearson coefficient is a number between -1 and 1 where -1 is a perfect negative correlation, zero indicates no correlation and 1 is a perfect positive correlation. The correlation is a suitable test for alignment because, in quantitation tests, the datasets are likely to have some biological difference between them due to the test parameters. The correlation will still return a positive correlation even if the peak intensity for one dataset is much smaller than the other.

If the pearsonCoefficientRaw is close to zero (say -0.1 to +0.1) then the two datasets are very dissimilar. It's possible to then flag to the user that they may have loaded two completely unrelated datasets into the analysis by mistake. Also if pearsonCoefficientAligned isn't very high (say less than 0.8) the user can be warned that the alignment process hasn't worked very well and they may want to use a different method.

The correlation coefficient value can be used to determine if the time alignment operation has improved the correlation (and therefore alignment). Some threshold values can be defined to warn the user if the alignment is poor before and/or after the alignment operation, or even if two completely unrelated datasets have been supplied to the algorithm.

For each 'fraction', a consensus time alignment is calculated. For each subproject, the shift from the consensus to the subproject is calculated. The values here are for the correlation between the consensus and the project.

The 'raw' values are the intensity values from each scan and subproject or fraction. The aligned values are the intensity values after all optimised time shifts have been applied.

Parameters
[in] subProject is the one based subproject number
[out] meanErrorsRaw is the mean average of the absolute difference between the consensus and raw values for each subproject and fraction
[out] meanErrorsAligned is the mean average of the absolute difference between the consensus and aligned values for each subproject and fraction
[out] stdevErrorsRaw is the standard deviation of the difference between the consensus and raw values for each subproject and fraction
[out] stdevErrorsAligned is the standard deviation of the difference between the consensus and aligned values for each subproject and fraction
[out] pearsonCoefficientRaw is the Pearson correlation coefficient between the consensus and raw values for each subproject and fraction
[out] pearsonCoefficientAligned is the Pearson correlation coefficient between the consensus and aligned values for each subproject and fraction
Returns
true if subProject >= 1 and <= numSubProjects and data has been loaded or calculated
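
For readers unfamiliar with the statistic, the sketch below computes a Pearson correlation coefficient for two intensity series and applies the example thresholds quoted above (roughly -0.1 to +0.1 for unrelated data, below 0.8 for a weak alignment). The function names `pearson` and `assessAlignment` and the returned message strings are hypothetical; the thresholds are the illustrative values from the description, not limits enforced by Parser.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Pearson correlation coefficient of two equally sized series.
// Returns 0.0 if the series are empty, differ in size, or either has zero variance.
double pearson(const std::vector<double>& x, const std::vector<double>& y)
{
    const std::size_t n = x.size();
    if (n == 0 || y.size() != n) {
        return 0.0;
    }
    double meanX = 0.0;
    double meanY = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= static_cast<double>(n);
    meanY /= static_cast<double>(n);

    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = x[i] - meanX;
        const double dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    if (sxx == 0.0 || syy == 0.0) {
        return 0.0;
    }
    return sxy / std::sqrt(sxx * syy);
}

// Hypothetical classification using the example thresholds from the text.
std::string assessAlignment(double pearsonRaw, double pearsonAligned)
{
    if (std::fabs(pearsonRaw) <= 0.1) {
        return "datasets may be unrelated";
    }
    if (pearsonAligned < 0.8) {
        return "alignment may be poor";
    }
    return "ok";
}
```

Because the coefficient is unaffected by a positive scale factor, a series and a ten-fold more intense copy of it still correlate perfectly, which matches the point made above about differing peak intensities.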

◆ getFeatureWidths()

const ms_ms1quant_time_align::featureWidths_t & getFeatureWidths ( ) const

Returns the multidimensional vector that has the feature widths for each project.

To access the correct map/dictionary, index the return value with [prj][bin].

The returned map/dictionary will have a set of values mapping specific retention times to widths.

Returns
a 2D vector accessed by: [prj][m/z bin] where the value will be a map (dictionary) of retention_time to width (in seconds)

◆ getFinalResults() [1/2]

std::vector< std::vector< std::vector< double > > > & getFinalResults ( )
Deprecated:
This function is deprecated and now calls the more accurately named getShiftsToConsensus()

◆ getFinalResults() [2/2]

const std::vector< std::vector< std::vector< double > > > & getFinalResults ( ) const
Deprecated:
This function is deprecated and now calls the more accurately named getShiftsToConsensus()

◆ getFractionToSubProjectMap()

ms_ms1quant_time_align::fractionToSubProject_t getFractionToSubProjectMap ( ) const

Return the map to enable looking up a list of sub projects for each fraction.

See also
getSubProjectToFractionMap()
The vector for each fraction contains 0 based subProject numbers. Fraction numbers don't need to be sequential, hence the use of a map (dictionary) rather than a vector.
Returns
the map of fraction to subproject
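
The relationship between the two lookups can be sketched by inverting a subproject-to-fraction vector into a fraction-to-subproject map. This is a minimal sketch, assuming the vector is indexed by zero-based subproject number as described for getSubProjectToFractionMap(); `invertSubProjectToFraction` is a hypothetical helper, and `FractionToSubProject` just mirrors the shape of `fractionToSubProject_t`.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Same shape as fractionToSubProject_t
typedef std::map<int, std::vector<int> > FractionToSubProject;

// Hypothetical helper: build the fraction -> subprojects map from a vector
// whose index is the zero-based subproject number and whose value is the
// fraction number that subproject belongs to.
FractionToSubProject invertSubProjectToFraction(const std::vector<int>& subProjectToFraction)
{
    FractionToSubProject result;
    for (std::size_t i = 0; i < subProjectToFraction.size(); ++i) {
        result[subProjectToFraction[i]].push_back(static_cast<int>(i));
    }
    return result;
}
```

A map is used because fraction numbers may have gaps: fractions 1 and 5 can exist without fractions 2 to 4.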

◆ getLastError()

int getLastError ( ) const
inherited

Return the error number of the last error that occurred.

All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.

See Error Handling.

See also
isValid(), getLastErrorString(), clearAllErrors(), getErrorHandler()
Returns
the error number of the last error, or 0 if there have been no errors.

◆ getLastErrorString()

std::string getLastErrorString ( ) const
inherited

Return the error description of the last error that occurred.

All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.

Returns
Most recent error, warning, information or debug message

See Error Handling.

See also
isValid(), getLastError(), clearAllErrors(), getErrorHandler()
Examples
common_error.cpp, config_enzymes.cpp, config_fragrules.cpp, config_license.cpp, config_mascotdat.cpp, config_masses.cpp, config_modfile.cpp, config_procs.cpp, config_quantitation.cpp, config_taxonomy.cpp, http_helper_getstring.cpp, and tools_aahelper.cpp.

◆ getRToffset()

double getRToffset ( const int  myProjectId,
const int  otherProjectId,
const double  rtInOtherProject,
const double  mOverZ 
) const

Get the offset between two projects.

Returns the estimated retention time offset between two projects for a given m/z value. To find a predicted retention time in myProject, call this function and then add the returned offset to rtInOtherProject.

This is implemented by finding the value in the array returned by getShiftsToConsensus() for otherProjectId and adding the value in the array returned by getShiftsFromConsensus() for myProjectId.

Both projects must belong to the same fraction for this to be meaningful.

For Waters MS^E DIA projects, it's possible to get LOGMSG_QUANT_TIME_ALIGN_INDEX_MZ_RANGE debug messages for some lower mass peptides. This is because an "average" charge will have been rounded up to the nearest integer and matrix_science::ms_ms1quant_match_component_body::updateMoverZ() is called, so the value returned by matrix_science::ms_ms1quant_match_component::getMoverz() may be lower than any of the precursor values in the 'survey' scans. If this happens, the debug message is generated and the nearest bin is used. This should give a very close approximation to the proper value, and the debug message can safely be ignored by the client application.

Logging, warnings and errors:
ms_errs::ERR_MSP_MS1QUANT_RTOFFSET_DIFF_FRACT
ms_errs::ERR_MSP_MS1QUANT_TIME_ALIGN_INDEX
ms_log::LOGMSG_QUANT_TIME_ALIGN_INDEX_MZ_RANGE
ms_log::LOGMSG_QUANT_RT_OFFSET_LOOKUP
Parameters
myProjectId is a zero based project id
otherProjectId is a zero based project id
rtInOtherProject is the retention time in the other project which is to be 'aligned' to a retention time in my project
mOverZ is required because alignment is done in m/z bins
Returns
the offset
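
The arithmetic described above can be sketched with a simplified data structure. This is an illustration only: `ProjectShift` holds a single pair of shifts per project (standing in for one m/z bin at one retention time of the 3D arrays), and `rtOffset` and `predictedRt` are hypothetical helpers, not the Parser implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one m/z bin at one retention time: each project
// has a shift to the consensus and a shift from the consensus back to it.
struct ProjectShift {
    double toConsensus;
    double fromConsensus;
};

// True if id is a valid zero-based index into shifts.
bool validProjectId(const std::vector<ProjectShift>& shifts, int id)
{
    return id >= 0 && static_cast<std::size_t>(id) < shifts.size();
}

// Offset = (other project -> consensus) + (consensus -> my project).
// Returns 0.0 for an invalid project id (a choice made for this sketch).
double rtOffset(const std::vector<ProjectShift>& shifts, int myProjectId, int otherProjectId)
{
    if (!validProjectId(shifts, myProjectId) || !validProjectId(shifts, otherProjectId)) {
        return 0.0;
    }
    return shifts[otherProjectId].toConsensus + shifts[myProjectId].fromConsensus;
}

// Predicted retention time in my project = rt in the other project + offset.
double predictedRt(const std::vector<ProjectShift>& shifts, int myProjectId,
                   int otherProjectId, double rtInOtherProject)
{
    return rtInOtherProject + rtOffset(shifts, myProjectId, otherProjectId);
}
```

Routing every pair of projects through a shared consensus means only two shifts per project need to be stored, rather than one shift for every pair of projects.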

◆ getShiftsFromConsensus()

const std::vector< std::vector< std::vector< double > > > & getShiftsFromConsensus ( ) const

Returns the vector that has all the offsets from a consensus to a specified project.

Return a read-only 3D vector that should be accessed with [prj][bin][rt].

The value in the array at those indexes is the predicted retention time shift from the 'consensus' to [prj].

To find the predicted retention time shift between project 'A' and project 'B', call getRToffset().

Returns
a 3D vector that should be accessed with [prj][bin][rt]

◆ getShiftsToConsensus()

const std::vector< std::vector< std::vector< double > > > & getShiftsToConsensus ( ) const

Returns the vector that has all the offsets from one project to a consensus.

Return a read-only 3D vector that should be accessed with [prj][bin][rt].

The value in the array at those indexes is the predicted retention time shift from [prj] to the 'consensus'.

To find the predicted retention time shift between project 'A' and project 'B', call getRToffset().

Returns
a 3D vector that should be accessed with [prj][bin][rt]

◆ getStatus()

ms_ms1quant_time_align::status_t getStatus ( void  ) const

Get the status of the data.

When the object is initially constructed, it sets the status to ms_ms1quant_time_align::ST_TA_NO_DATA.

A successful call to loadXmlFile() will cause the flag to be set to ms_ms1quant_time_align::ST_TA_LOADED_FROM_XML, and if the object is initialised from a CDB file it will be set to ms_ms1quant_time_align::ST_TA_LOADED_FROM_CDB. If the data is calculated in msquantlib, the value is set to ms_ms1quant_time_align::ST_TA_CALCULATED.

Returns
the current status of the object

◆ getStatusAsText()

std::string getStatusAsText ( void  ) const

Get the status of the data as text for a message.

When the object is initially constructed, it sets the status to ms_ms1quant_time_align::ST_TA_NO_DATA.

A successful call to loadXmlFile() will cause the flag to be set to ms_ms1quant_time_align::ST_TA_LOADED_FROM_XML, and if the object is initialised from a CDB file it will be set to ms_ms1quant_time_align::ST_TA_LOADED_FROM_CDB. If the data is calculated in msquantlib, the value is set to ms_ms1quant_time_align::ST_TA_CALCULATED.

Returns
the current status of the object as text that can be used in a log message

◆ getSubProjectToFractionMap()

const std::vector< int > getSubProjectToFractionMap ( ) const

Return a vector containing the fraction number that each subproject belongs to.

See also
getFractionToSubProjectMap()
Returns
a vector of the size identical to the number of subprojects. The offset into the vector is therefore zero based, and the values are the fraction numbers for each subproject.

◆ isValid()

bool isValid ( ) const
inherited

◆ loadXmlFile()

bool loadXmlFile ( const std::string &  xmlFilename,
const std::string &  schemaDirectory 
)

Populate the object from an XML file.

This function is used to load the time alignment data as a discrete XML file.

If this function is successful, it will return true and the value returned by getStatus() will be ms_ms1quant_time_align::ST_TA_LOADED_FROM_XML.

Parameters
xmlFilename The path and filename of the file to load
schemaDirectory is the directory which will contain the distiller_time_align_1.xsd file
Returns
true if the file is successfully loaded. If it fails, there will be an error message in the matrix_science::ms_errs object passed to the constructor.

◆ saveXmlFile()

bool saveXmlFile ( const std::string &  xmlFilename,
const std::string &  schemaDirectory 
) const

Save just the time alignment data to an XML file.

This function is used to save the time alignment data as a discrete XML file. The time alignment data can also be saved as part of the ms1 based quantitation results by calling ms_ms1quantitation::saveXmlFile with the saveTimeAlignmentData parameter set to true.

Parameters
xmlFilename The path and filename of the file to save
schemaDirectory is the directory which will contain the distiller_time_align_1.xsd file
Returns
true if the file is successfully saved. If it fails, there will be an error message in the matrix_science::ms_errs object passed to the constructor.

◆ setAlgorithmName()

void setAlgorithmName ( const char *  algorithmName)

Brief descriptive text that describes the algorithm used for the time alignment.

Currently restricted to "Unsupervised" or "Supervised"

Parameters
algorithmName describes the type of algorithm used.

◆ setAverageFeatureWidths()

void setAverageFeatureWidths ( const std::vector< double > &  featureWidths)

For supervised time alignment, a single feature width per subproject is used.

For supervised time alignment, the feature width for a subproject is calculated by taking the average xic width for all identified peptides in the subproject. Hence, just one value per subproject.

Parameters
featureWidths is a 1D vector with numSubproject values in seconds
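
The averaging described above can be sketched as follows. This is a minimal illustration, assuming the XIC widths of the identified peptides are already available per subproject in seconds; `averageWidthPerSubProject` is a hypothetical helper, and returning 0.0 for a subproject with no identified peptides is a choice made for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: xicWidths[subProject] holds the XIC widths (seconds) of
// all identified peptides in that subproject. The result has one average
// width per subproject, the shape expected for a 1D vector of feature widths.
std::vector<double> averageWidthPerSubProject(const std::vector<std::vector<double> >& xicWidths)
{
    std::vector<double> averages;
    averages.reserve(xicWidths.size());
    for (std::size_t prj = 0; prj < xicWidths.size(); ++prj) {
        const std::vector<double>& widths = xicWidths[prj];
        if (widths.empty()) {
            averages.push_back(0.0);  // no identified peptides in this subproject
            continue;
        }
        double sum = 0.0;
        for (std::size_t i = 0; i < widths.size(); ++i) {
            sum += widths[i];
        }
        averages.push_back(sum / static_cast<double>(widths.size()));
    }
    return averages;
}
```

The resulting vector could then be passed to setAverageFeatureWidths().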

◆ setEvaluation()

void setEvaluation ( int  subProject,
double  meanErrorsRaw,
double  meanErrorsAligned,
double  stdevErrorsRaw,
double  stdevErrorsAligned,
double  pearsonCoefficientRaw,
double  pearsonCoefficientAligned 
)
protected
Parameters
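[in] subProject is the one based subproject number
[in] meanErrorsRaw is the mean average of the absolute difference between the consensus and raw values (see getEvaluation())
[in] meanErrorsAligned see getEvaluation()
[in] stdevErrorsRaw see getEvaluation()
[in] stdevErrorsAligned see getEvaluation()
[in] pearsonCoefficientRaw see getEvaluation()
[in] pearsonCoefficientAligned see getEvaluation()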

The documentation for this class was generated from the following files: