Generalized Linear Models - 1

Generalising the general linear model (GLM)

Link Function

Parameter Estimation

Summary

Looking Beyond
continuous outcomes

g(µ) = Xβ

GLM

Use Principle of least squares
to estimate values for β

Fitted model gives
line of best fit
determined by minimising
the error sum of squares (SSE)

Fitted values for β are called
least squares estimates
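
A minimal sketch of least squares estimation for a single predictor, assuming a made-up data set (the x and y values below are illustrative only, not from these notes). It computes the intercept and slope that minimise the SSE using the standard closed-form formulas.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) that minimise the error sum of squares."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope = Sxy / Sxx, intercept follows from the means.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def sse(x, y, intercept, slope):
    """Error sum of squares for a given line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Illustrative data (assumption, not from the notes).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_line(x, y)
print("intercept:", b0, "slope:", b1, "SSE:", sse(x, y, b0, b1))
```

Moving either estimate away from the fitted value can only increase the SSE, which is what "line of best fit" means here.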

Although each individual outcome
is Y/N, the proportion is often used
as the outcome

In order to understand population:
Proportion
(mortality) rate
Disease? Y/N

Health studies
look for specific outcome

g() is a function that links
E(Y) = µ to X in a linear fashion

can easily change g to
other (non-linear) functions

g() is link function

g() must be monotonic & differentiable

Binary outcome variable - Binomial distribution
Logit link function
Logistic regression

Each type of data
has a distribution &
a natural link function
(commonly used but
not absolute requirement)
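
A small sketch pairing distributions with their commonly used link functions. Only the Binomial/logit pairing comes from these notes; the Normal/identity and Poisson/log entries are added here as standard illustrations, and the `LINKS` table and its layout are hypothetical names chosen for this example.

```python
import math

# distribution -> (link name, link g, inverse link g^-1)
# Only the binomial/logit row is taken from the notes; the others are
# standard pairings added for illustration.
LINKS = {
    "normal": ("identity", lambda mu: mu, lambda eta: eta),
    "binomial": (
        "logit",
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
    "poisson": ("log", math.log, math.exp),
}

for dist, (name, g, g_inv) in LINKS.items():
    mu = 0.3
    eta = g(mu)  # linear predictor scale
    print(dist, name, "g(0.3) =", round(eta, 4), "back:", round(g_inv(eta), 4))
```

Each link is monotonic, so it can be inverted to get back from the linear predictor to the mean.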

e.g. if outcome variable
is a probability, it is never > 1,
so the relationship is non-linear

Transform Y using
logit function

relationship is
now straight line

then use linear model

Transformation:
logit(p) = ln(p / (1 - p))
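
A short sketch of the idea that the logit transform straightens an S-shaped relationship. The coefficients `beta0` and `beta1` and the x values are assumptions picked for illustration: proportions are generated from a logistic curve, then transformed back with the logit.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse of the logit: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Assumed values, for illustration only.
beta0, beta1 = -2.0, 1.0
xs = [0, 1, 2, 3, 4]

# Proportions follow an S-shaped curve in x and stay between 0 and 1.
props = [inv_logit(beta0 + beta1 * x) for x in xs]

# After the logit transform the points lie on a straight line in x.
transformed = [logit(p) for p in props]
steps = [b - a for a, b in zip(transformed, transformed[1:])]

print("proportions:", [round(p, 3) for p in props])
print("logit scale:", [round(t, 3) for t in transformed])
print("step per unit x:", [round(s, 3) for s in steps])
```

The step per unit of x is constant on the logit scale (equal to `beta1`), while the steps between the raw proportions are not.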

Need to use
Maximum Likelihood Estimation (MLE)

MLE - what is the most likely value
of the βs given the observed data?
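
A minimal sketch of MLE for a logistic regression with one predictor, assuming a small made-up data set. It uses Newton-Raphson steps on the Binomial log-likelihood, which is one common way to find the maximum and is not necessarily what any particular statistics package does internally.

```python
import math


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def log_likelihood(b0, b1, x, y):
    """Bernoulli log-likelihood of the data for given betas."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = inv_logit(b0 + b1 * xi)
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll


def fit_logistic(x, y, iterations=25):
    """Newton-Raphson search for the betas that maximise the likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0           # gradient (score) of the log-likelihood
        h00 = h01 = h11 = 0.0   # information matrix entries
        for xi, yi in zip(x, y):
            p = inv_logit(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^-1 * gradient (2x2 inverse by hand).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Illustrative Y/N data (assumption, not from the notes).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0_hat, b1_hat = fit_logistic(x, y)
print("MLE betas:", b0_hat, b1_hat)
print("log-likelihood at MLE:", log_likelihood(b0_hat, b1_hat, x, y))
```

The search relies on the derivatives of the log-likelihood, which is why the link function must be differentiable.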

Cannot use Least Squares Estimation
to estimate the values of β

To do the MLE:

  • link function must be
    monotonic & differentiable
  • Distribution must belong to
    exponential family

Type of link function
changes for different
types of data

Provides options for
modelling more types of
outcome data

Differences from the general linear model (GLM)

Downside

Use a link function
(transforming Y)

Use MLE instead of
least squares estimation

Some model-specific
pseudo R-squared values

May need to use
deviance to
measure model fit

Cannot use R-squared
to measure explanatory power
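
A rough sketch of measuring fit with deviance and one pseudo R-squared for a binary outcome. The outcomes and the fitted probabilities below are hypothetical values chosen for illustration, and McFadden's measure is used here as just one example of a model-specific pseudo R-squared.

```python
import math


def binary_log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y given fitted probabilities p."""
    return sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )


# Hypothetical outcomes and fitted probabilities (assumptions).
y = [0, 0, 1, 0, 1, 0, 1, 1]
fitted = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

# Null model: every observation gets the overall proportion of Y outcomes.
null_p = [sum(y) / len(y)] * len(y)

ll_model = binary_log_likelihood(y, fitted)
ll_null = binary_log_likelihood(y, null_p)

# For individual Y/N data the saturated log-likelihood is 0,
# so deviance = -2 * log-likelihood.
residual_deviance = -2.0 * ll_model
null_deviance = -2.0 * ll_null

# McFadden's pseudo R-squared: 1 - ll_model / ll_null.
mcfadden_r2 = 1.0 - ll_model / ll_null

print("null deviance:", round(null_deviance, 3))
print("residual deviance:", round(residual_deviance, 3))
print("McFadden pseudo R-squared:", round(mcfadden_r2, 3))
```

A residual deviance well below the null deviance suggests the predictors improve the fit; the pseudo R-squared does not have the "proportion of variance explained" reading that R-squared has in the general linear model.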