Generalized Linear Models - 1
Generalising the General Linear Model (GLM)
Link Function
Parameter Estimation
Summary
Looking Beyond
continuous outcomes
g(µ) = Xβ
GLM
Use Principle of least squares
to estimate values for β
Fitted model gives
line of best fit
determined by minimising
the error sums of squares (SSE)
Fitted values for β are called
least squares estimates
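A minimal sketch of least squares estimation for a continuous outcome, using NumPy. The data values and the names `x`, `y`, `X`, `beta_hat` and `sse` are hypothetical and serve only to illustrate minimising the SSE.

```python
import numpy as np

# Hypothetical toy data: one predictor x and a continuous outcome y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix X with an intercept column, so the model is E(Y) = X @ beta.
X = np.column_stack([np.ones_like(x), x])

# Least squares estimates: the beta that minimises
# SSE = sum((y - X @ beta) ** 2).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Fitted values lie on the line of best fit.
fitted = X @ beta_hat
sse = float(np.sum((y - fitted) ** 2))

print("least squares estimates:", beta_hat)
print("SSE:", sse)
```

Any other choice of β gives a larger SSE on the same data, which is what makes these the least squares estimates.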
Although each individual outcome
is Y/N, the proportion is often used
as the outcome
In order to understand population:
Proportion
(mortality) rate
Disease? Y/N
Health studies
look for specific outcome
g() is a function that links
E(Y) = µ to X in a linear fashion
can easily change g to
other (non-linear) functions
g() is link function
g() must be monotonic & differentiable
Binary outcome variable - Binomial distribution
Logit link function
Logistic regression
Each type of data
has a distribution &
a natural link function
(commonly used but
not absolute requirement)
e.g. if outcome variable
is a probability, never <0 or >1,
non-linear relationship
Transform Y using
logit function
relationship is
now straight line
then use linear model
Transformation: logit(p) = ln(p / (1 - p))
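A small sketch of the logit transformation and its inverse, assuming NumPy. The function names `logit` and `inv_logit` and the grid of values in `eta` are illustrative choices, not part of the original notes.

```python
import numpy as np

def logit(p):
    """Log-odds: maps a probability in (0, 1) onto the whole real line."""
    p = np.asarray(p, dtype=float)
    return np.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse of logit: maps a linear predictor back to a probability."""
    eta = np.asarray(eta, dtype=float)
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical values of the linear predictor (eta = X @ beta).
eta = np.linspace(-4.0, 4.0, 9)

# On the probability scale the curve is S-shaped and stays inside (0, 1).
p = inv_logit(eta)

# On the logit scale the relationship is a straight line again.
eta_back = logit(p)

print(np.round(p, 3))
print(np.round(eta_back, 3))
```

Because the logit is monotonic, each probability corresponds to exactly one value of the linear predictor, so nothing is lost by moving between the two scales.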
Need to use
Maximum Likelihood Estimation (MLE)
MLE - what is the most likely value
of the βs given the observed data?
Cannot use Least Squares Estimation
to estimate the values of β
To do the MLE:
- Link function must be monotonic & differentiable
- Distribution must belong to the exponential family
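One way MLE can be carried out for logistic regression is Newton-Raphson on the Bernoulli log-likelihood. The sketch below uses only NumPy; the simulated data, the value of `true_beta` and the helper names `fit_logistic_mle` and `log_likelihood` are assumptions made for illustration, not a reference implementation.

```python
import numpy as np

def inv_logit(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood under a logit link."""
    eta = X @ beta
    return float(np.sum(y * eta - np.log1p(np.exp(eta))))

def fit_logistic_mle(X, y, n_iter=25, tol=1e-10):
    """Newton-Raphson search for the beta that maximises the likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = inv_logit(X @ beta)            # fitted probabilities
        score = X.T @ (y - mu)              # gradient of the log-likelihood
        w = mu * (1.0 - mu)                 # Bernoulli variance at each point
        info = X.T @ (X * w[:, None])       # Fisher information matrix
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated (hypothetical) binary outcomes with one predictor.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([np.ones_like(x), x])
true_beta = np.array([-0.5, 1.2])
y = rng.binomial(1, inv_logit(X @ true_beta)).astype(float)

beta_hat = fit_logistic_mle(X, y)
print("maximum likelihood estimates:", beta_hat)
print("log-likelihood at the estimates:", log_likelihood(beta_hat, X, y))
```

Each step needs the derivative of the inverse link (here `mu * (1 - mu)`), which is why the link function has to be differentiable.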
Type of link function
changes for different
types of data
Provides options for
modelling more types of
outcome data
Differences from the general linear model
Downside
Use a link function
(transforming Y)
Use MLE instead of
least squares estimation
Some model-specific
pseudo R-squared values
May need to use
deviance to
measure model fit
Cannot use R-squared
to measure explanatory power
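A sketch of how deviance and one pseudo R-squared could be computed for a binary outcome, assuming NumPy. McFadden's pseudo R-squared is used here as one example of a model-specific measure; the outcomes in `y` and the fitted probabilities in `p_hat` are hypothetical values, not results from a real model.

```python
import numpy as np

def bernoulli_log_likelihood(y, p):
    """Log-likelihood of binary outcomes y given probabilities p."""
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Hypothetical observed outcomes and fitted probabilities.
y = np.array([0, 0, 1, 0, 1, 1, 0, 1], dtype=float)
p_hat = np.array([0.1, 0.3, 0.6, 0.2, 0.8, 0.7, 0.4, 0.9])

# Null model: intercept only, so every fitted probability is the mean of y.
p_null = np.full_like(y, y.mean())

ll_model = bernoulli_log_likelihood(y, p_hat)
ll_null = bernoulli_log_likelihood(y, p_null)

# For ungrouped binary data the saturated log-likelihood is 0,
# so the deviance reduces to -2 times the log-likelihood.
deviance = -2.0 * ll_model
null_deviance = -2.0 * ll_null

# McFadden's pseudo R-squared: relative improvement over the null model.
mcfadden_r2 = 1.0 - ll_model / ll_null

print("deviance:", deviance)
print("null deviance:", null_deviance)
print("McFadden pseudo R-squared:", mcfadden_r2)
```

A smaller deviance relative to the null deviance indicates a better fit. Pseudo R-squared values are not interpreted as a proportion of variance explained in the way R-squared is for least squares.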