The Shapiro-Wilk W test.

#include <ms_shapiro_wilk.hpp>

Public Member Functions
	ms_shapiro_wilk ()

	ms_shapiro_wilk (std::deque< std::pair< size_t, double > > x, long n, long n1, long n2)
	The Shapiro-Wilk W test.

void	appendSampleValue (double y)
	Add a new sample value to the list to be tested.

void	calculate (long n, long n1, long n2)
	Calculate results using values previously added using appendSampleValue().

void	clearSampleValues ()
	Clear current vector of X values.

double	getErrorCode () const
	Returns the error code for the Shapiro-Wilk W-statistic.

double	getPValue () const
	Returns the P-value for the Shapiro-Wilk W-statistic.

double	getResult () const
	Returns the Shapiro-Wilk W-statistic.

Static Public Member Functions
static void	swilk (bool init, std::deque< std::pair< size_t, double > > x, long n, long n1, long n2, std::deque< double > &a, double &w, double &pw, int &ifault)
	AS R94: Calculate the Shapiro-Wilk W test statistic and p-value directly.

Detailed Description

The Shapiro-Wilk W test.

Testing for normality

Testing for outliers and reporting a standard deviation for the protein ratio can only be performed if the peptide ratios are consistent with a sample from a normal distribution (in log space). If the peptide ratios do not appear to come from a normal distribution, this may indicate that the values are meaningless and that something went systematically wrong with the analysis. On the other hand, it may indicate something interesting, such as peptides that have been mis-assigned and actually come from two proteins with very different ratios, so that the distribution is bimodal. Interpretation of test success or failure must be done on a case-by-case basis.
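Because the test is applied in log space, the ratios need to be transformed before they are tested. The following is a minimal sketch of that preparation step. logSortedRatios is a hypothetical helper, not part of the library, and the choice of the natural logarithm and the skipping of non-positive ratios are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of the library): convert peptide ratios to
// log space and return them sorted in increasing order.
// Assumption: non-positive ratios have no logarithm and are skipped.
std::vector<double> logSortedRatios(const std::vector<double>& ratios) {
    std::vector<double> logs;
    logs.reserve(ratios.size());
    for (double r : ratios) {
        if (r > 0.0) {
            logs.push_back(std::log(r));
        }
    }
    std::sort(logs.begin(), logs.end());
    return logs;
}
```

Sorting at this stage is convenient because the test functions expect the sample values in increasing order.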

Shapiro-Wilk W test

In the Shapiro-Wilk W test, the null hypothesis is that the sample is taken from a normal distribution. This hypothesis is rejected if the p-value for the test statistic W is less than 0.05. The routine used is valid for sample sizes between 3 and 2000.
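The decision rule stated above can be sketched as two small checks. Both function names are hypothetical and are not part of the library. The 0.05 threshold and the 3 to 2000 range are taken from the text.

```cpp
#include <cassert>

// Hypothetical helpers, not part of the library.

// True if the null hypothesis of normality is rejected at level alpha.
bool rejectsNormality(double pValue, double alpha = 0.05) {
    return pValue < alpha;
}

// True if n lies in the range for which the text states the routine is valid.
bool sampleSizeInStatedRange(long n) {
    return n >= 3 && n <= 2000;
}
```

A value that is not rejected is only consistent with normality; it does not prove that the sample is normally distributed.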

Source code for the Shapiro-Wilk W test algorithm

References:

Royston, J. P. (1982): An Extension of Shapiro and Wilk's W Test for Normality to Large Samples. Journal of the Royal Statistical Society Series C (Applied Statistics) 31(2):115-124.
Royston, J. P. (1982): Algorithm AS 181: The W Test for Normality. Journal of the Royal Statistical Society Series C (Applied Statistics) 31(2):176-180.
Royston, P. (1995): Remark AS R94: A Remark on Algorithm AS 181: The W-test for Normality. Journal of the Royal Statistical Society Series C (Applied Statistics) 44(4):547-551.
Algorithms AS R94, AS 66 and AS 241 from StatLib: http://lib.stat.cmu.edu/

Constructor & Destructor Documentation

◆ ms_shapiro_wilk() [1/2]

ms_shapiro_wilk ( )

Default constructor.

◆ ms_shapiro_wilk() [2/2]

ms_shapiro_wilk	(	std::deque< std::pair< size_t, double > >	x,
		long	n,
		long	n1,
		long	n2
	)

The Shapiro-Wilk W test.

Note: This constructor can only be called from C++. For other languages, use the appendSampleValue() and calculate() functions.

After calling this constructor, check for any error using getErrorCode(), then check that getPValue() returns a value greater than 0.05 to determine whether the sample is consistent with a normal distribution.

Parameters

x	is a list of the sample values in increasing order. The first (optional) item in the pair is typically used for an index. This value is not used by the algorithm, but is provided as a convenience. The second item in the pair is the sample value.
n	is the total sample size (including any right-censored values).
n1	is the number of uncensored cases (n1 <= n).
n2	is the integer part of n/2.
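As an illustration of how the four arguments relate to each other, the sketch below builds them from an unsorted list of values. ShapiroWilkInput and prepareInput are hypothetical names, not part of the library, and the sketch assumes there are no censored values, so that n1 equals n.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical container for the four constructor arguments.
struct ShapiroWilkInput {
    std::deque<std::pair<std::size_t, double>> x;
    long n;
    long n1;
    long n2;
};

// Hypothetical helper: pair each value with its original position, sort by
// value in increasing order, and derive n, n1 and n2.
ShapiroWilkInput prepareInput(const std::vector<double>& values) {
    ShapiroWilkInput in;
    for (std::size_t i = 0; i < values.size(); ++i) {
        in.x.emplace_back(i, values[i]);  // first = index, second = sample value
    }
    std::stable_sort(in.x.begin(), in.x.end(),
                     [](const std::pair<std::size_t, double>& a,
                        const std::pair<std::size_t, double>& b) {
                         return a.second < b.second;
                     });
    in.n = static_cast<long>(in.x.size());
    in.n1 = in.n;      // assumption: no censored values
    in.n2 = in.n / 2;  // integer part of n/2
    return in;
}
```

Keeping the original position in the first item of each pair makes it possible to map a sorted value back to the peptide it came from.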

Member Function Documentation

◆ appendSampleValue()

void appendSampleValue ( double x )

Add a new sample value to the list to be tested.

Add a new sample value to the list to be tested when calling calculate(). Sample values must be added in increasing order.

Parameters

x	is the sample value to be added to the list.

◆ calculate()

void calculate	(	long	n,
		long	n1,
		long	n2
	)

Calculate results using values previously added using appendSampleValue(). The parameters n, n1 and n2 have the same meaning as in the constructor.

◆ clearSampleValues()

void clearSampleValues ( )

Clear current vector of X values.

If multiple calls to calculate() are to be made for different sample values, then call this function before calling appendSampleValue().
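The clear, append, calculate cycle can be sketched as a function template that works with any type exposing the three documented member functions. loadAndCalculate and RecordingTest are hypothetical names: RecordingTest is a stand-in that only records calls so that the sketch compiles without the library, and it performs no statistics. The sketch assumes no censoring, so n1 equals n.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: load one data set into a test object and run it.
template <typename Test>
void loadAndCalculate(Test& test, std::vector<double> values) {
    std::sort(values.begin(), values.end());  // values must be added in increasing order
    test.clearSampleValues();                 // discard values from any earlier run
    for (double v : values) {
        test.appendSampleValue(v);
    }
    const long n = static_cast<long>(values.size());
    test.calculate(n, n, n / 2);              // assumption: no censoring, so n1 == n
}

// Hypothetical stand-in that records calls; it is NOT the library class.
struct RecordingTest {
    std::vector<double> values;
    long n = 0, n1 = 0, n2 = 0;
    int clears = 0;
    void clearSampleValues() { values.clear(); ++clears; }
    void appendSampleValue(double v) { values.push_back(v); }
    void calculate(long n_, long n1_, long n2_) { n = n_; n1 = n1_; n2 = n2_; }
};
```

With the real class, the same template could in principle be called once per data set on a single object, followed by getErrorCode() and getPValue().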

◆ getErrorCode()

double getErrorCode ( ) const

Returns the error code for the Shapiro-Wilk W-statistic.

Possible error codes are:

0 for no error
1 if n1 < 3
2 if n > 5000 (a non-fatal error, but the accuracy of the p-value is not guaranteed in this case)
3 if n2 < n/2
4 if n1 > n or (n1 < n and n < 20)
5 if the proportion censored (n - n1)/n > 0.8
6 if the data have zero range
7 if the sample values are not sorted in increasing order
8 if ppnd7 returns an error (which should never occur in normal operation)

Returns: error code
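For logging or user feedback, the codes listed above can be mapped to short messages. The sketch below is illustrative only: describeErrorCode and resultsAvailable are hypothetical names, not part of the library. It takes an int, so the double returned by getErrorCode() would need a cast. The rule that codes 0 and 2 still yield results follows the description of ifault in swilk().

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: short message for each documented error code.
std::string describeErrorCode(int code) {
    switch (code) {
        case 0: return "no error";
        case 1: return "n1 < 3";
        case 2: return "n > 5000, p-value accuracy not guaranteed";
        case 3: return "n2 < n/2";
        case 4: return "n1 > n, or n1 < n and n < 20";
        case 5: return "proportion censored exceeds 0.8";
        case 6: return "data have zero range";
        case 7: return "sample values not sorted in increasing order";
        case 8: return "error return from ppnd7";
        default: return "unknown error code";
    }
}

// Hypothetical helper: true if W and the p-value were calculated
// (code 0, or the non-fatal code 2).
bool resultsAvailable(int code) {
    return code == 0 || code == 2;
}
```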

◆ swilk()

void swilk	(	bool	init,
		std::deque< std::pair< size_t, double > >	x,
		long	n,
		long	n1,
		long	n2,
		std::deque< double > &	a,
		double &	w,
		double &	pw,
		int &	ifault
	)

static

AS R94: Calculate the Shapiro-Wilk W test statistic and p-value directly.

Translated to C++ from the F77 version of AS R94 in StatLib.

Royston, P. (1995): Remark AS R94: A Remark on Algorithm AS 181: The W-test for Normality. Journal of the Royal Statistical Society Series C (Applied Statistics) 44(4):547-551.

In the F77 version of this function, the w parameter could be used to alter the function's behaviour. That feature has not been retained here.

Parameters

[in]	init	If false, initialise the scratch vector a.
[in]	x	Sample values sorted in increasing order.
[in]	n	Total sample size (usually `x.size()` cast to long).
[in]	n1	Sample size less censored cases (n1 <= n; often `n1 = x.size()`).
[in]	n2	`(long)(n/2)`
[in,out]	a	Scratch vector used by the algorithm.
[out]	w	The Shapiro-Wilk W statistic calculated from the data.
[out]	pw	The P-value of the statistic under the null hypothesis.
[out]	ifault	Error code, documented in getErrorCode(). If 0 or 2, then both w and pw were calculated. Otherwise an error occurred.
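Since several of the error conditions depend only on the arguments, they can be checked before the call is made. The sketch below mirrors the conditions documented in getErrorCode(). precheckArguments and the Sample alias are hypothetical names, not part of the library. The order of the checks, the exact comparison with zero for the range, and the treatment of an empty list are assumptions, and may differ from what the translated routine does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

using Sample = std::deque<std::pair<std::size_t, double>>;

// Hypothetical pre-check: returns the documented error code that the
// arguments would be expected to trigger, or 0 if none applies.
int precheckArguments(const Sample& x, long n, long n1, long n2) {
    if (n1 < 3) return 1;
    if (n2 < n / 2) return 3;
    if (n1 > n || (n1 < n && n < 20)) return 4;
    // n >= n1 >= 3 here, so the division is safe.
    if (static_cast<double>(n - n1) / static_cast<double>(n) > 0.8) return 5;
    for (std::size_t i = 1; i < x.size(); ++i) {
        if (x[i].second < x[i - 1].second) return 7;  // not in increasing order
    }
    // Assumption: an empty list is reported as zero range.
    if (x.empty() || x.back().second - x.front().second == 0.0) return 6;
    return n > 5000 ? 2 : 0;  // code 2 is non-fatal
}
```

A check of this kind cannot anticipate code 8, which is raised inside the algorithm itself.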

The documentation for this class was generated from the following files:

ms_shapiro_wilk.hpp
ms_shapiro_wilk.cpp
