Matrix Science Mascot Parser toolkit
 
ms_imputation_knn Class Reference

#include <ms_imputation.hpp>


Public Member Functions

void appendErrors (const ms_errors &src)
 Copies all errors from another instance and appends them to the end of this object's list.
 
void clearAllErrors ()
 Remove all errors from the current list of errors.
 
void copyFrom (const ms_errors *right)
 Use this member to make a copy of another instance.
 
std::vector< std::vector< ms_imputation_missing_val > > & getDataWithMissing ()
 Get the data array that the imputation method will impute.
 
const ms_errs * getErrorHandler () const
 Retrieve the error object using this function to get access to all errors and error parameters.
 
int getKnnNumNeighbours ()
 Get the number of nearest neighbours used when calculating an imputed value. Default value = 5.
 
bool getKnnUseWeightedAverage ()
 Get the flag that indicates whether a weighted average of the nearest neighbours is used when calculating an imputed value. Default value = true.
 
int getLastError () const
 Return the error number of the last error that occurred.
 
std::string getLastErrorString () const
 Return the error description of the last error that occurred.
 
std::vector< std::vector< double > > impute () override
 Returns a 2D array of doubles with the calculated nearest neighbour average value in place of missing values.
 
bool isValid () const
 Call this function to determine if there have been any errors.
 
void setKnnNumNeighbours (int knnNumNeighboursIn)
 Set the number of nearest neighbours used when calculating the imputed value. Default value = 5.
 
void setKnnUseWeightedAverage (bool knnUseWeightedAverageIn)
 Set the flag that indicates whether a weighted average of the nearest neighbours is used when calculating an imputed value. Default value = true.
 

Protected Member Functions

void averageFindAverage ()
 Find the average of each column.
 
std::vector< std::vector< double > > changems_imputation_missing_valArrayToDoubleArray (std::vector< std::vector< ms_imputation_missing_val > > dataProcess)
 Extracts only the values from a 2D array of missing value objects.
 
std::vector< double > changems_imputation_missing_valVecToDoubleVec (std::vector< ms_imputation_missing_val > missingObservation)
 Extracts only the values from a vector of missing value objects.
 
std::vector< double > getKnownValues (std::vector< ms_imputation_missing_val >)
 Loop over one observation and collect the values that are not missing.
 
std::vector< int > getMissingIndexes (std::vector< ms_imputation_missing_val >)
 Loop over one observation and collect the indexes of the missing values.
 
std::vector< std::vector< int > > removeDuplicateIndexes (std::vector< std::vector< int > > duplicateIndexes)
 The same combination of missing indexes can appear multiple times within a dataset. Duplicate combinations are removed from the missing indexes list to avoid building the same ALGLIB model repeatedly.
 
void setDataWithMissing (const std::vector< std::vector< ms_imputation_missing_val > > &dataWithMissingIn)
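
The index bookkeeping described for getMissingIndexes() and removeDuplicateIndexes() can be illustrated with a small standalone sketch. It is not the Parser implementation: it assumes a missing value is encoded as NaN in a plain vector of doubles, and it removes duplicates by sorting, which does not preserve the original order. The function names missingIndexes and uniqueIndexSets are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one observation: NaN marks a missing value.
// Returns the column indexes at which the observation has no value.
std::vector<int> missingIndexes(const std::vector<double>& observation) {
    std::vector<int> indexes;
    for (std::size_t c = 0; c < observation.size(); ++c) {
        if (std::isnan(observation[c])) {
            indexes.push_back(static_cast<int>(c));
        }
    }
    return indexes;
}

// Removes repeated index combinations so that each distinct pattern of
// missing columns is handled once. Assumption: output order is sorted,
// not the order of first appearance.
std::vector<std::vector<int>> uniqueIndexSets(std::vector<std::vector<int>> sets) {
    std::sort(sets.begin(), sets.end());
    sets.erase(std::unique(sets.begin(), sets.end()), sets.end());
    return sets;
}
```

With this sketch, two observations that are both missing columns 0, 2 and 4 contribute a single entry to the deduplicated list.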
 

Detailed Description

KNN imputation class. It inherits from the general imputation method class and adds the properties and methods specific to k-nearest-neighbour imputation.

EXPERIMENTAL
This class is EXPERIMENTAL. Both the API and the implementation may change in a future version of Parser.

A fast and reliable method of imputation that makes more accurate predictions than average value imputation. For each observation with missing values, KNN finds the most similar complete observations and then calculates the average of the k closest ones. KNN requires that a subset of observations contains no missing values; this subset becomes the neighbour group from which all predictions are calculated. Two configurable settings, accessed through the get/set member functions, affect the predicted value. knnNumNearest sets the number of neighbours included when the prediction is calculated. knnUseWeightedMean sets whether the average of the neighbours is weighted by the closeness of each neighbour.

For example, this array of five observations of five variables, with missing values marked with an "x":

0.0 1.0 2.0 3.0 4.0
0.3 1.3 2.3 3.3 4.3
0.1 1.1 2.1 3.1 4.1
x 1.2 x 3.2 x
-0.1 x 1.9 x 3.9

is imputed to:

0.0 1.0 2.0 3.0 4.0
0.3 1.3 2.3 3.3 4.3
0.1 1.1 2.1 3.1 4.1
0.2 1.2 2.2 3.2 4.2
-0.1 1.05 1.9 3.05 3.9

with the parameters knnNumNearest = 2 and knnUseWeightedMean = False.
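
The worked example above can be reproduced with a small standalone sketch. This is not the Parser implementation: it assumes missing values are encoded as NaN, that similarity is the Euclidean distance over the columns that are known in the incomplete observation, and that the prediction is the plain (unweighted) mean of the k nearest complete observations. The names Row, Matrix, MISSING, isComplete, distanceOverKnown and knnImputeUnweighted are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Row = std::vector<double>;
using Matrix = std::vector<Row>;

// Assumption: a missing value is represented by NaN.
const double MISSING = std::numeric_limits<double>::quiet_NaN();

bool isComplete(const Row& row) {
    return std::none_of(row.begin(), row.end(),
                        [](double v) { return std::isnan(v); });
}

// Euclidean distance using only the columns that are known in `target`.
double distanceOverKnown(const Row& target, const Row& neighbour) {
    double sum = 0.0;
    for (std::size_t c = 0; c < target.size(); ++c) {
        if (!std::isnan(target[c])) {
            const double d = target[c] - neighbour[c];
            sum += d * d;
        }
    }
    return std::sqrt(sum);
}

Matrix knnImputeUnweighted(const Matrix& data, std::size_t k) {
    assert(k > 0);
    // The neighbour group is the subset of observations with no missing values.
    Matrix complete;
    for (const Row& row : data) {
        if (isComplete(row)) complete.push_back(row);
    }
    assert(complete.size() >= k);

    Matrix result = data;
    for (Row& row : result) {
        if (isComplete(row)) continue;
        // Rank the complete observations by distance to this observation.
        std::vector<std::pair<double, std::size_t>> ranked;
        for (std::size_t i = 0; i < complete.size(); ++i) {
            ranked.emplace_back(distanceOverKnown(row, complete[i]), i);
        }
        std::partial_sort(ranked.begin(),
                          ranked.begin() + static_cast<std::ptrdiff_t>(k),
                          ranked.end());
        // Replace each missing entry with the mean of the k nearest neighbours.
        for (std::size_t c = 0; c < row.size(); ++c) {
            if (!std::isnan(row[c])) continue;
            double sum = 0.0;
            for (std::size_t n = 0; n < k; ++n) {
                sum += complete[ranked[n].second][c];
            }
            row[c] = sum / static_cast<double>(k);
        }
    }
    return result;
}
```

Calling knnImputeUnweighted on the five-by-five array above with k = 2 picks the second and third observations as neighbours of the fourth, and the first and third as neighbours of the fifth, which yields the imputed rows shown.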

To use KNN imputation, create an instance of the ms_imputation_knn class, set the KNN properties you want, then create an instance of ms_imputation, passing the ms_imputation_knn instance to the constructor, and finally call ms_imputation.impute(). For example, with a valid ms_ms1quantitation object, in C#:

ms_imputation_knn knnImputation = new ms_imputation_knn();
knnImputation.setKnnNumNeighbours(5);
knnImputation.setKnnUseWeightedAverage(true);
ms_imputation Imputation = new ms_imputation(ms1Quant, knnImputation, IMPUTATION_VARIABLE.IMPUTE_PEPTIDE_RATIO);
VecVecdouble imputationRes = Imputation.impute();

Member Function Documentation

◆ appendErrors()

void appendErrors ( const ms_errors & src)
inherited

Copies all errors from another instance and appends them to the end of this object's list.

Parameters
src  The object to copy the errors across from. See Maintaining object references: two rules of thumb.

◆ clearAllErrors()

void clearAllErrors ( )
inherited

Remove all errors from the current list of errors.

The list of 'errors' can include fatal errors, warning messages, information messages and different levels of debugging messages.

All messages are accumulated into a list in this object, until clearAllErrors() is called.

See Error Handling.

See also
isValid(), getLastError(), getLastErrorString(), getErrorHandler()
Examples
common_error.cpp, resfile_error.cpp, and resfile_summary.cpp.

◆ copyFrom()

void copyFrom ( const ms_errors * right)
inherited

Use this member to make a copy of another instance.

Parameters
right  The source to initialise from.

◆ getErrorHandler()

const ms_errs * getErrorHandler ( ) const
inherited

Retrieve the error object using this function to get access to all errors and error parameters.

See Error Handling.

Returns
Constant pointer to the error handler
See also
isValid(), getLastError(), getLastErrorString(), clearAllErrors(), getErrorHandler()
Examples
common_error.cpp, and http_helper_getstring.cpp.

◆ getLastError()

int getLastError ( ) const
inherited

Return the error number of the last error that occurred.

All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.

See Error Handling.

See also
isValid(), getLastErrorString(), clearAllErrors(), getErrorHandler()
Returns
the error number of the last error, or 0 if there have been no errors.

◆ getLastErrorString()

std::string getLastErrorString ( ) const
inherited

Return the error description of the last error that occurred.

All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.

Returns
Most recent error, warning, information or debug message

See Error Handling.

See also
isValid(), getLastError(), clearAllErrors(), getErrorHandler()
Examples
common_error.cpp, config_enzymes.cpp, config_fragrules.cpp, config_license.cpp, config_mascotdat.cpp, config_masses.cpp, config_modfile.cpp, config_procs.cpp, config_quantitation.cpp, config_taxonomy.cpp, http_helper_getstring.cpp, and tools_aahelper.cpp.

◆ impute()

std::vector< std::vector< double > > impute ( )
override virtual

Returns a 2D array of doubles with the calculated nearest neighbour average value in place of missing values.

A fast and reliable method of imputation that makes more accurate predictions than average value imputation. For each observation with missing values, KNN finds the most similar complete observations and then calculates the average of the k closest ones. KNN requires that a subset of observations contains no missing values; this subset becomes the neighbour group from which all predictions are calculated. Two configurable settings, accessed through the get/set member functions, affect the predicted value. knnNumNearest sets the number of neighbours included when the prediction is calculated. knnUseWeightedMean sets whether the average of the neighbours is weighted by the closeness of each neighbour.

For example, this array of five observations of five variables, with missing values marked with an "x":

0.0 1.0 2.0 3.0 4.0
0.3 1.3 2.3 3.3 4.3
0.1 1.1 2.1 3.1 4.1
x 1.2 x 3.2 x
-0.1 x 1.9 x 3.9

is imputed to:

0.0 1.0 2.0 3.0 4.0
0.3 1.3 2.3 3.3 4.3
0.1 1.1 2.1 3.1 4.1
0.2 1.2 2.2 3.2 4.2
-0.1 1.05 1.9 3.05 3.9

with the parameters knnNumNearest = 2 and knnUseWeightedMean = False.
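
The description above does not state how closeness is converted into a weight when knnUseWeightedMean is true. One common choice, shown here purely as an assumption and not as the Parser implementation, is inverse-distance weighting, in which each neighbour's value is weighted by the reciprocal of its distance. The function name inverseDistanceWeightedMean is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: inverse-distance weighted mean of the neighbours'
// values for one missing column. `distances[i]` is the distance from the
// incomplete observation to the neighbour that supplied `values[i]`.
double inverseDistanceWeightedMean(const std::vector<double>& values,
                                   const std::vector<double>& distances) {
    assert(!values.empty() && values.size() == distances.size());
    double weightedSum = 0.0;
    double weightTotal = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        // An exact match (distance 0) would receive infinite weight,
        // so its value is returned directly.
        if (distances[i] == 0.0) return values[i];
        const double weight = 1.0 / distances[i];
        weightedSum += weight * values[i];
        weightTotal += weight;
    }
    return weightedSum / weightTotal;
}
```

Under this assumption, neighbours at equal distances reduce to the plain mean, while a neighbour twice as far away contributes half the weight.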

Returns
A 2D array with the nearest neighbour predictions in place of missing observations

Implements ms_imputation_method.

◆ isValid()

bool isValid ( ) const
inherited

Call this function to determine if there have been any errors.

◆ setDataWithMissing()

void setDataWithMissing ( const std::vector< std::vector< ms_imputation_missing_val > > &  dataWithMissingIn)
protected inherited

Set the missing value array.

Parameters
dataWithMissingIn  The array of missing values to be imputed

The documentation for this class was generated from the following files: