Statistics Toolbox    
lillietest

Lilliefors test for goodness of fit to a normal distribution

Syntax

H = lillietest(X)
H = lillietest(X,alpha)
[H,P,LSTAT,CV] = lillietest(X,alpha)

Description

H = lillietest(X) performs the Lilliefors test on the input data vector X and returns H, the result of the hypothesis test. H is 1 if the hypothesis that X has a normal distribution can be rejected at the 5% significance level, and 0 if it cannot.

The Lilliefors test evaluates the hypothesis that X has a normal distribution with unspecified mean and variance, against the alternative that X does not have a normal distribution. This test compares the empirical distribution of X with a normal distribution having the same mean and variance as X. It is similar to the Kolmogorov-Smirnov test, but it adjusts for the fact that the parameters of the normal distribution are estimated from X rather than specified in advance.
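
For intuition, the test statistic is essentially the largest vertical gap between the empirical distribution function of the sample and the CDF of a normal distribution fitted to it. The following sketch is illustrative only (it is not the toolbox implementation) and assumes the Statistics Toolbox function normcdf:

    x = sort(randn(50,1));            % any data vector, sorted
    n = length(x);
    z = normcdf(x, mean(x), std(x));  % CDF of the normal fitted to x
    dplus  = max((1:n)'/n - z);       % largest gap from above
    dminus = max(z - (0:n-1)'/n);     % largest gap from below
    lstat  = max(dplus, dminus)       % Lilliefors-type statistic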

H = lillietest(X,alpha) performs the Lilliefors test at the 100*alpha% level rather than the 5% level. alpha must be between 0.01 and 0.2.

[H,P,LSTAT,CV] = lillietest(X,alpha) returns three additional outputs. P is the p-value of the test, obtained by linear interpolation into the table of critical values computed by Lilliefors. LSTAT is the value of the test statistic. CV is the critical value for determining whether to reject the null hypothesis. If the value of LSTAT is outside the range of the Lilliefors table, P is returned as NaN, but H still indicates whether to reject the hypothesis.
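
For example, a call that requests all four outputs at the 1% significance level might look like the following sketch (the data vector is simulated here purely for illustration):

    x = 10 + randn(100,1);                    % simulated sample
    [h, p, lstat, cv] = lillietest(x, 0.01)   % test at the 1% level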

Example

Do car weights follow a normal distribution? Not exactly, because weights are always positive, and a normal distribution allows both positive and negative values. However, perhaps the normal distribution is a reasonable approximation.
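
One way to check, sketched below, is to run lillietest on a vector of car weights. The carsmall sample data set and its Weight variable are assumed here as the data source:

    load carsmall                            % assumed sample data set
    [h, p, lstat, cv] = lillietest(Weight)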

The Lilliefors test statistic of 0.10317 is larger than the cutoff value of 0.0886 for a 5% level test, so we reject the hypothesis of normality. In fact, the p-value of this test is approximately 0.02.

To visualize the distribution, we can make a histogram. This graph shows that the distribution is skewed to the right: from the peak near 2250, the frequencies drop off abruptly to the left but more gradually to the right.
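
The histogram described above could be produced with a sketch like this (again assuming the Weight vector loaded earlier):

    hist(Weight)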

Sometimes it is possible to transform a variable to make its distribution more nearly normal. A log transformation, in particular, tends to compensate for skewness to the right.
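
A sketch of retesting the log-transformed weights (same assumed Weight vector):

    [h, p] = lillietest(log(Weight))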

Now the p-value is approximately 0.13, so we do not reject the hypothesis.

Reference

[1] Conover, W. J. (1980). Practical Nonparametric Statistics. New York: Wiley.

See Also
hist, jbtest, kstest2

