#include <ms_imputation.hpp>
Public Member Functions | |
void | appendErrors (const ms_errors &src) |
Copies all errors from another instance and appends them at the end of own list. | |
void | clearAllErrors () |
Remove all errors from the current list of errors. | |
void | copyFrom (const ms_errors *right) |
Use this member to make a copy of another instance. | |
std::vector< std::vector< ms_imputation_missing_val > > & | getDataWithMissing () |
Get the data array that the imputation method will impute. | |
const ms_errs * | getErrorHandler () const |
Retrieve the error object using this function to get access to all errors and error parameters. | |
int | getLastError () const |
Return the error code of the last error that occurred. | |
std::string | getLastErrorString () const |
Return the error description of the last error that occurred. | |
std::vector< std::vector< double > > | impute () override |
Returns a 2D array of double with the miss forest prediction results in place of missing values. | |
bool | isValid () const |
Call this function to determine if there have been any errors. | |
Protected Member Functions | |
void | averageFindAverage () |
Find average of each column. | |
std::vector< std::vector< double > > | changems_imputation_missing_valArrayToDoubleArray (std::vector< std::vector< ms_imputation_missing_val > > dataProcess) |
Extracts only the values from a 2d array of missing value objects. | |
std::vector< double > | changems_imputation_missing_valVecToDoubleVec (std::vector< ms_imputation_missing_val > missingObservation) |
Extracts only the values from a vector of missing value objects. | |
std::vector< double > | getKnownValues (std::vector< ms_imputation_missing_val >) |
Loop over one observation and get the values that are not missing. | |
std::vector< int > | getMissingIndexes (std::vector< ms_imputation_missing_val >) |
Loop over one observation and get the indexes of the missing values. | |
std::vector< std::vector< int > > | removeDuplicateIndexes (std::vector< std::vector< int > > duplicateIndexes) |
The same combination of missing indexes can appear multiple times within a dataset. Duplicate indexes are removed from the missing indexes list to avoid making repeated alglib models. | |
void | setDataWithMissing (const std::vector< std::vector< ms_imputation_missing_val > > &dataWithMissingIn) |
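The index helpers listed above (getMissingIndexes() and removeDuplicateIndexes()) can be pictured with the following minimal sketch. It is not the library's implementation: the function names missingIndexesOf and uniqueIndexSets are hypothetical, a missing value is assumed to be represented as NaN rather than as an ms_imputation_missing_val object, and duplicates are removed by sorting, which changes the order of the index sets.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: a missing value is encoded as NaN in this sketch.
// Collect the column indexes of the missing values in one observation.
std::vector<int> missingIndexesOf(const std::vector<double> &observation)
{
    std::vector<int> indexes;
    for (std::size_t i = 0; i < observation.size(); ++i)
        if (std::isnan(observation[i]))
            indexes.push_back(static_cast<int>(i));
    return indexes;
}

// Keep each combination of missing indexes once, so that a model would only
// need to be built once per combination. Sorting reorders the combinations.
std::vector<std::vector<int>> uniqueIndexSets(std::vector<std::vector<int>> sets)
{
    std::sort(sets.begin(), sets.end());
    sets.erase(std::unique(sets.begin(), sets.end()), sets.end());
    return sets;
}
```

For an observation such as "x 1.2 x 3.2 x", the first helper yields the indexes 0, 2 and 4; passing the same combination twice to the second helper returns it only once.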
An accurate prediction algorithm that uses an iterative random forest approach to create progressively improving predictions.
The original paper can be found here: https://academic.oup.com/bioinformatics/article/28/1/112/219101
While slower to complete than KNN, Miss Forest has proven to be more accurate than KNN during in-house testing and within the paper. Unlike KNN, Miss Forest does not need any complete observations to function. Miss Forest uses average value imputation as a starting point and as a means of filling out all missing observations. Although the missing values are no longer missing, the positions of the original missing values are saved and used throughout. Miss Forest then creates a random forest from the average-imputed data for each variable that originally had missing values. Each original missing value is predicted and updated within the working dataset. This process is repeated, the two products are compared, and the difference is recorded. The forest creation, prediction, update and comparison are repeated until the difference between consecutive products increases in size. The final prediction is the product from just before the difference increased.
For example, the following array of five observations of five variables, with missing values marked with an "x":
    0.0   1.0   2.0   3.0   4.0
    0.3   1.3   2.3   3.3   4.3
    0.1   1.1   2.1   3.1   4.1
    x     1.2   x     3.2   x
    -0.1  x     1.9   x     3.9
is imputed to:
    0.0   1.0   2.0   3.0   4.0
    0.3   1.3   2.3   3.3   4.3
    0.1   1.1   2.1   3.1   4.1
    0.16  1.2   2.16  3.2   4.15
    -0.1  1.0   1.9   3.0   3.9
This example requires 6 iterations.
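The stopping rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the names Matrix, Mask, differenceAtMissing and iterateUntilDifferenceIncreases are hypothetical, the difference measure (sum of squared differences over the originally missing cells) is an assumption, and the predictStep callback stands in for the "build forests, predict, update" step, which is not shown. The maxIterations cap is also an addition to guarantee termination.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Mask = std::vector<std::vector<bool>>;

// Assumed difference measure: sum of squared differences, taken only over
// the cells that were missing in the original data.
double differenceAtMissing(const Matrix &a, const Matrix &b, const Mask &missing)
{
    double sum = 0.0;
    for (std::size_t r = 0; r < a.size(); ++r)
        for (std::size_t c = 0; c < a[r].size(); ++c)
            if (missing[r][c]) {
                const double d = a[r][c] - b[r][c];
                sum += d * d;
            }
    return sum;
}

// Hypothetical driver. 'current' starts as the average-imputed data and
// 'predictStep' returns the next product from the current one.
Matrix iterateUntilDifferenceIncreases(
    Matrix current, const Mask &missing,
    const std::function<Matrix(const Matrix &)> &predictStep,
    int maxIterations = 100)
{
    double previousDifference = std::numeric_limits<double>::infinity();
    for (int i = 0; i < maxIterations; ++i) {
        Matrix next = predictStep(current);
        const double difference = differenceAtMissing(next, current, missing);
        if (difference > previousDifference)
            return current; // the product from before the increase
        previousDifference = difference;
        current = std::move(next);
    }
    return current; // iteration cap reached without an increase
}
```

The first comparison can never trigger the stop because the previous difference starts at infinity, so at least two products are always generated before a result is returned.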
The Miss Forest imputation class inherits the general imputation method class, with average value specific properties and methods.
To use Miss Forest imputation, create an instance of the ms_imputation_missforest class, then an instance of ms_imputation using ms_imputation_missforest in the constructor, and finally call ms_imputation.impute(). For example, with a valid ms_ms1quantitation object, in C#:
    ms_imputation_missforest missForestImputation = new ms_imputation_missforest();
    ms_imputation Imputation = new ms_imputation(ms1Quant, missForestImputation, IMPUTATION_VARIABLE.IMPUTE_PEPTIDE_RATIO);
    VecVecdouble imputationRes = Imputation.impute();
void appendErrors (const ms_errors &src) [inherited]
Copies all errors from another instance and appends them at the end of own list.
src | The object to copy the errors across from. See Maintaining object references: two rules of thumb. |
void clearAllErrors () [inherited]
Remove all errors from the current list of errors.
The list of 'errors' can include fatal errors, warning messages, information messages and different levels of debugging messages.
All messages are accumulated into a list in this object, until clearAllErrors() is called.
See Error Handling.
void copyFrom (const ms_errors *right) [inherited]
Use this member to make a copy of another instance.
right | is the source to initialise from |
const ms_errs * getErrorHandler () const [inherited]
Retrieve the error object using this function to get access to all errors and error parameters.
See Error Handling.
int getLastError () const [inherited]
Return the error code of the last error that occurred.
All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.
See Error Handling.
std::string getLastErrorString () const [inherited]
Return the error description of the last error that occurred.
All errors are accumulated into a list in this object, until clearAllErrors() is called. This function returns the last error that occurred.
See Error Handling.
std::vector< std::vector< double > > impute () [override, virtual]
Returns a 2D array of double with the miss forest prediction results in place of missing values.
An accurate prediction algorithm that uses an iterative random forest approach to create progressively improving predictions. The original paper can be found here: https://academic.oup.com/bioinformatics/article/28/1/112/219101
While slower to complete than KNN, Miss Forest has proven to be more accurate than KNN during in-house testing and within the paper. Unlike KNN, Miss Forest does not need any complete observations to function. Miss Forest uses average value imputation as a starting point and as a means of filling out all missing observations. Although the missing values are no longer missing, the positions of the original missing values are saved and used throughout. Miss Forest then creates a random forest from the average-imputed data for each variable that originally had missing values. Each original missing value is predicted and updated within the working dataset. This process is repeated, the two products are compared, and the difference is recorded. The forest creation, prediction, update and comparison are repeated until the difference between consecutive products increases in size. The final prediction is the product from just before the difference increased.
For example, the following array of five observations of five variables, with missing values marked with an "x":
    0.0   1.0   2.0   3.0   4.0
    0.3   1.3   2.3   3.3   4.3
    0.1   1.1   2.1   3.1   4.1
    x     1.2   x     3.2   x
    -0.1  x     1.9   x     3.9
is imputed to:
    0.0   1.0   2.0   3.0   4.0
    0.3   1.3   2.3   3.3   4.3
    0.1   1.1   2.1   3.1   4.1
    0.16  1.2   2.16  3.2   4.15
    -0.1  1.0   1.9   3.0   3.9
This example requires 6 iterations.
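The average value starting point, together with the saved positions of the original missing values, can be sketched as below. This is an illustration only and makes several assumptions: missing values are encoded as NaN instead of ms_imputation_missing_val objects, the input is rectangular, a column with no known values falls back to 0.0, and the names MeanImputed and imputeColumnMeans are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Mask = std::vector<std::vector<bool>>;

// Result of the starting step: filled data plus the remembered positions.
struct MeanImputed {
    Matrix values;   // data with every missing cell replaced by its column mean
    Mask wasMissing; // true where the original cell was missing
};

// Assumption: NaN marks a missing value and all rows have the same length.
MeanImputed imputeColumnMeans(const Matrix &data)
{
    MeanImputed result;
    if (data.empty())
        return result;
    const std::size_t cols = data[0].size();

    // First pass: per-column sum and count of the known values.
    std::vector<double> sum(cols, 0.0);
    std::vector<std::size_t> count(cols, 0);
    for (const auto &row : data)
        for (std::size_t c = 0; c < cols; ++c)
            if (!std::isnan(row[c])) {
                sum[c] += row[c];
                ++count[c];
            }

    // Second pass: fill missing cells and record where they were.
    for (const auto &row : data) {
        std::vector<double> filled(cols);
        std::vector<bool> mask(cols);
        for (std::size_t c = 0; c < cols; ++c) {
            mask[c] = std::isnan(row[c]);
            // Assumed fallback of 0.0 for a column with no known values.
            const double mean = count[c] ? sum[c] / count[c] : 0.0;
            filled[c] = mask[c] ? mean : row[c];
        }
        result.values.push_back(filled);
        result.wasMissing.push_back(mask);
    }
    return result;
}
```

Keeping the mask separate from the filled values is what allows later iterations to overwrite only the cells that were originally missing.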
Implements ms_imputation_method.
|
inherited |
Call this function to determine if there have been any errors.
This will return true unless there have been any fatal errors.
See Error Handling.
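The accumulate-then-check behaviour described for the error functions can be illustrated with a small stand-in class. Everything in this sketch is hypothetical: mock_errors is not the Mascot Parser error class, the add() member and the severity enum are invented for the illustration, and returning 0 when the list is empty is an assumption. Only the member names isValid, getLastError, getLastErrorString and clearAllErrors are taken from the list above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for illustration only; not the Mascot Parser class.
class mock_errors {
private:
    // Assumed severity levels, following the kinds of message named above.
    enum severity_internal { fatal_, warning_, info_ };
    struct entry { int code; std::string text; severity_internal sev; };
    std::vector<entry> entries_;

public:
    // Invented helpers for recording messages in this sketch.
    void addFatal(int code, const std::string &text) { entries_.push_back({code, text, fatal_}); }
    void addWarning(int code, const std::string &text) { entries_.push_back({code, text, warning_}); }

    // True unless a fatal error has been recorded since the last clear.
    bool isValid() const
    {
        for (const entry &e : entries_)
            if (e.sev == fatal_)
                return false;
        return true;
    }

    // Code of the most recent message; 0 when the list is empty (assumption).
    int getLastError() const { return entries_.empty() ? 0 : entries_.back().code; }

    // Text of the most recent message; empty when the list is empty.
    std::string getLastErrorString() const
    {
        return entries_.empty() ? std::string() : entries_.back().text;
    }

    // Messages accumulate until this is called.
    void clearAllErrors() { entries_.clear(); }
};
```

In this sketch a warning leaves isValid() true, a fatal error makes it false, and clearAllErrors() resets the object to a valid state.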
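void setDataWithMissing (const std::vector< std::vector< ms_imputation_missing_val > > &dataWithMissingIn) [protected, inherited]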
Set the array of missing values.
dataWithMissingIn | The array of missing values to be imputed |