glmfit (Statistics Toolbox)

Generalized linear model fitting

Syntax

b = glmfit(X,Y,'distr')
b = glmfit(X,Y,'distr','link','estdisp',offset,pwts,'const')
[b,dev,stats] = glmfit(...)

Description

b = glmfit(x,y,'distr') fits the generalized linear model for response Y, predictor variable matrix X, and distribution 'distr'. The following distributions are available: 'binomial', 'gamma', 'inverse gaussian', 'lognormal', 'normal' (the default), and 'poisson'. In most cases Y is a vector of response measurements, but for the binomial distribution Y is a two-column array having the measured number of counts in the first column and the number of trials (the binomial N parameter) in the second column. X is a matrix having the same number of rows as Y and containing the values of the predictor variables for each observation. The output b is a vector of coefficient estimates. This syntax uses the canonical link (see below) to relate the distribution parameter to the predictors.

b = glmfit(x,y,'distr','link','estdisp',offset,pwts,'const') provides additional control over the fit. The 'link' argument specifies the relationship between the distribution parameter (µ) and the fitted linear combination of predictor variables (xb). In most cases 'link' is one of the following:

'link'
Meaning
Default (Canonical) Link

'identity'
µ = xb
'normal'

'log'
log(µ) = xb
'poisson'

'logit'
log(µ/ (1-µ)) = xb
'binomial'

'probit'
norminv(µ) = xb

'comploglog'
log(-log(1-µ)) = xb

'logloglink'
log(-log(µ)) = xb

'reciprocal'
1/µ = xb
'gamma'

p (a number)
µ^p = xb
'inverse gaussian' (with p=-2)

'link'	Meaning	Default (Canonical) Link
`'identity'`	µ = xb	`'normal'`
`'log'`	log(µ) = xb	`'poisson'`
`'logit'`	log(µ/ (1-µ)) = xb	`'binomial'`
`'probit'`	norminv(µ) = xb
`'comploglog'`	log(-log(1-µ)) = xb
`'logloglink'`	log(-log(µ)) = xb
`'reciprocal'`	1/µ = xb	`'gamma'`
`p` (a number)	µ^p = xb	`'inverse gaussian'` (with p=-2)

Alternatively, you can write functions to define your own custom link. You specify the link argument as a three-element cell array containing functions that define the link function, its derivative, and its inverse. For example, suppose you want to define a reciprocal square root link using inline functions. You could define the variable mylinks to use as your 'link' argument by writing:

FL = inline('x.^-.5')
FD = inline('-.5*x.^-1.5')
FI = inline('x.^-2')
mylinks = {FL FI FD}

Alternatively, you could define functions named FL, FD, and FI in their own M-files, and then specify mylinks in the form

```
mylinks = {@FL @FD @FI}
```

The 'estdisp' argument can be 'on' to estimate a dispersion parameter for the binomial or Poisson distribution, or 'off' (the default) to use the theoretical value of 1.0 for those distributions. The glmfit function always estimates dispersion parameters for other distributions.

The offset and pwts parameters can be vectors of the same length as Y, or can be omitted (or specified as an empty vector). The offset vector is a special predictor variable whose coefficient is known to be 1.0. As an example, suppose that you are modeling the number of defects on various surfaces, and you want to construct a model in which the expected number of defects is proportional to the surface area. You might use the number of defects as your response, along with the Poisson distribution, the log link function, and the log surface area as an offset.

The pwts argument is a vector of prior weights. As an example, if the response value Y(i) is the average of f(i) measurements, you could use f as a vector of prior weights.

The 'const' argument can be 'on' (the default) to estimate a constant term, or 'off' to omit the constant term. If you want the constant term, use this argument rather than specifying a column of ones in the X matrix.

[b,dev,stats] = glmfit(...) returns the additional outputs dev and stats. dev is the deviance at the solution vector. The deviance is a generalization of the residual sum of squares. It is possible to perform an analysis of deviance to compare several models, each a subset of the other, and to test whether the model with more terms is significantly better than the model with fewer terms.

stats is a structure with the following fields:

stats.dfe = degrees of freedom for error
stats.s = theoretical or estimated dispersion parameter
stats.sfit = estimated dispersion parameter
stats.estdisp = 1 if dispersion is estimated, 0 if fixed
stats.beta = vector of coefficient estimates (same as b)
stats.se = vector of standard errors of the coefficient estimates b
stats.coeffcorr = correlation matrix for b
stats.t = t statistics for b
stats.p = p-values for b
stats.resid = vector of residuals
stats.residp = vector of Pearson residuals
stats.residd = vector of deviance residuals
stats.resida = vector of Anscombe residuals

If you estimate a dispersion parameter for the binomial or Poisson distribution, then stats.s is set equal to stats.sfit. Also, the elements of stats.se differ by the factor stats.s from their theoretical values.

Example

We have data on cars weighing between 2100 and 4300 pounds. For each car weight we have the total number of cars of that weight, and the number that can be considered to get "poor mileage" according to some test. For example, 8 out of 21 cars weighing 3100 pounds get poor mileage according to a measurement of the miles they can travel on a gallon of gasoline.

w = (2100:200:4300)';
poor = [1 2 0 3 8 8 14 17 19 15 17 21]';
total = [48 42 31 34 31 21 23 23 21 16 17 21]';

We can compare several fits to these data. First, let's try fitting logit and probit models:

[bl,dl,sl] = glmfit(w,[poor total],'binomial');
[bp,dp,sp] = glmfit(w,[poor total],'binomial','probit');
dl
dl =
    6.4842
dp
dp =
    7.5693

The deviance for the logit model is smaller than for the probit model. Although this is not a formal test, it leads us to prefer the logit model.

We can do a formal test comparing two logit models. We already fit one model using w as a linear predictor. Let's fit another logit model using both linear and squared terms in w. If there is no true effect for the squared term, the difference in their deviances should be small compared with a chi-square distribution having one degree of freedom.

[b2,d2,s2] = glmfit([w w.^2],[poor total],'binomial')
dl-d2
ans =
    0.7027
chi2cdf(dl-d2,1)
ans =
    0.5981

A difference of 0.7072 is not at all unusual for a chi-square distribution with one degree of freedom, so the quadratic model does not give a significantly better fit than the simpler linear model.

The following are the coefficient estimates, their standard errors, t-statistics, and p-values for the linear model:

[b sl.se sl.t sl.p]
ans =
  -13.3801    1.3940   -9.5986    0.0000
    0.0042    0.0004    9.4474    0.0000

This shows that we cannot simplify the model any further. Both the intercept and slope coefficients are significantly different from 0, as indicated by p-values that are 0.0000 to four decimal places.

See Also
glmval, glmdemo, nlinfit, regress, regstats

References

[1] Dobson, A. J. An Introduction to Generalized Linear Models. 1990, CRC Press.

[2] MuCullagh, P. and J. A. Nelder. Generalized Linear Models. 2nd edition, 1990, Chapman and Hall.

glmdemo glmval