We’ve been working with models like glm.nb(nspp ~ log(Plot_Area), data = dat_plt)
And testing “significance” with a likelihood ratio test (LRT)
But there are other approaches, perhaps some that seem more familiar
The summary function gives us a lot of info (scroll down)
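For reference, a sketch of the call behind this output (the formula and data name come from the printed Call below; glm.nb lives in the MASS package, and mod is a hypothetical name):

library(MASS)  # provides glm.nb()

mod <- glm.nb(nspp ~ log(Plot_Area), data = dat_plt)
summary(mod)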
Call:
glm.nb(formula = nspp ~ log(Plot_Area), data = dat_plt, init.theta = 9.014755177,
    link = log)

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)    -1.75853    0.33458  -5.256 1.47e-07 ***
log(Plot_Area)  0.48172    0.05051   9.537  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(9.0148) family taken to be 1)

    Null deviance: 622.95  on 527  degrees of freedom
Residual deviance: 515.85  on 526  degrees of freedom
AIC: 2331.3

Number of Fisher Scoring iterations: 1

              Theta:  9.01
          Std. Err.:  1.76

 2 x log-likelihood:  -2325.269
Some of this info in fact directly relates to the LRT:

    Null deviance: 622.95  on 527  degrees of freedom
Residual deviance: 515.85  on 526  degrees of freedom

(527 d.f.) - (526 d.f.) = 1 d.f., exactly the right number of degrees of freedom for comparing the null (intercept-only) model to the full model
The LRT statistic for those two models is
\[ \begin{aligned} \Lambda &= 2 (\log\mathscr{L}_\text{full} - \log\mathscr{L}_\text{reduced}) \\ &= 98.88 \end{aligned} \]
which is (approximately) equal to 622.95 - 515.85 = 107.1
(only approximately, because the deviances printed by glm.nb both hold \(\theta\) fixed at the full-model estimate, while the LRT re-estimates \(\theta\) when the reduced model is refit)
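A sketch of that comparison in R (assuming dat_plt from above; full_mod and reduced_mod are hypothetical names):

library(MASS)

full_mod    <- glm.nb(nspp ~ log(Plot_Area), data = dat_plt)
reduced_mod <- glm.nb(nspp ~ 1, data = dat_plt)  # intercept-only; re-estimates theta

Lambda <- 2 * (logLik(full_mod) - logLik(reduced_mod))  # the LRT statistic
as.numeric(Lambda)

full_mod$null.deviance - full_mod$deviance  # close to Lambda, but not identical

pchisq(as.numeric(Lambda), df = 1, lower.tail = FALSE)  # P-value on 1 d.f.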
What’s going on?
The deviance is an attempt to generalize the idea of “sum of squares”
\[ \begin{aligned} \text{Null deviance} &= 2 (\log\mathscr{L}_\text{saturated} - \log\mathscr{L}_\text{reduced}) \\ \text{Residual deviance} &= 2 (\log\mathscr{L}_\text{saturated} - \log\mathscr{L}_\text{full}) \end{aligned} \] where \(\log\mathscr{L}_\text{saturated}\) is the log likelihood of the saturated model, in which every data point gets its own parameter (i.e. it is the highest possible log likelihood)
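Subtracting one deviance from the other cancels the saturated term, which is why the difference is exactly the LRT statistic:
\[ \begin{aligned} \text{Null deviance} - \text{Residual deviance} &= 2 (\log\mathscr{L}_\text{saturated} - \log\mathscr{L}_\text{reduced}) - 2 (\log\mathscr{L}_\text{saturated} - \log\mathscr{L}_\text{full}) \\ &= 2 (\log\mathscr{L}_\text{full} - \log\mathscr{L}_\text{reduced}) = \Lambda \end{aligned} \]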
What’s going on?
Null deviance corresponds to “total sum of squares” in a least squares world
Residual deviance corresponds to “residual sum of squares”
In least squares world we do an \(F\)-test (aka ANOVA) which compares residual and total sums of squares
In likelihood world we do an LRT (e.g. with the anova function) which compares residual and null deviances
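In R both comparisons run through the same anova function; a minimal sketch, reusing the hypothetical glm.nb fits from the earlier sketch:

anova(reduced_mod, full_mod)
# for glm.nb fits this performs likelihood ratio tests; MASS notes that
# these tests treat the dispersion parameter theta as fixed across models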
The only difference between a GLM and an LM is that an LM assumes normality
It is a neat mathematical fact that under normality,

least squares \(\Longleftrightarrow\) maximum likelihood
minimize the residual sum of squares \(\Longleftrightarrow\) maximize the log likelihood
\(P\)-value of sum-of-squares ANOVA \(=\) \(P\)-value of LRT
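A quick sketch of that equivalence on simulated data (all names here are made up for illustration):

set.seed(1)
d <- data.frame(x = runif(100))
d$y <- 1 + 2 * d$x + rnorm(100)

ls_fit <- lm(y ~ x, data = d)                      # least squares
ml_fit <- glm(y ~ x, family = gaussian, data = d)  # maximum likelihood

coef(ls_fit)  # identical estimates...
coef(ml_fit)  # ...from the two fitting criteria

anova(ls_fit)              # F-test on sums of squares
anova(ml_fit, test = "F")  # deviance-based analogue: same P-value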
The summary function also gave \(P\)-values
Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)    -1.75853    0.33458  -5.256 1.47e-07 ***
log(Plot_Area)  0.48172    0.05051   9.537  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
What are those about?
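One piece we can verify straight from the table: each z value is just the estimate divided by its standard error (a Wald statistic):

-1.75853 / 0.33458  # = -5.256, the intercept's z value
 0.48172 / 0.05051  # =  9.537, the slope's z value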
Recall that for a mixed effects model we did not get any \(P\)-values, despite summary reporting \(t\) values (\(t\) as in \(t\)-test)
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: log_dbh_cm ~ Native_Status * hii + (1 | PlotID)
   Data: dat

      AIC       BIC    logLik -2*log(L)  df.resid
  66659.5   66711.3  -33323.8   66647.5     41447

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.3415 -0.6791 -0.0499  0.5649  4.8794

Random effects:
 Groups   Name        Variance Std.Dev.
 PlotID   (Intercept) 0.1367   0.3697
 Residual             0.2818   0.5309
Number of obs: 41453, groups:  PlotID, 530

Fixed effects:
                         Estimate Std. Error t value
(Intercept)              2.354021   0.050752  46.383
Native_Statusnative      0.619260   0.040328  15.356
hii                      0.000556   0.002294   0.242
Native_Statusnative:hii -0.007436   0.001750  -4.249

Correlation of Fixed Effects:
            (Intr) Ntv_St hii
Ntv_Sttsntv -0.603
hii         -0.915  0.488
Ntv_Sttsnt:  0.540 -0.949 -0.478
Here again an LRT is the better option
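A sketch of the calls behind the output below (the model names and formulas come from the printout; lme4 supplies lmer, and REML = FALSE matches the “fit by maximum likelihood” line above):

library(lme4)

reduced_mod <- lmer(log_dbh_cm ~ Native_Status + hii + (1 | PlotID),
                    data = dat, REML = FALSE)
mixed_mod   <- lmer(log_dbh_cm ~ Native_Status * hii + (1 | PlotID),
                    data = dat, REML = FALSE)

anova(reduced_mod, mixed_mod)  # LRT on 1 d.f. for the interaction term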
Data: dat
Models:
reduced_mod: log_dbh_cm ~ Native_Status + hii + (1 | PlotID)
mixed_mod: log_dbh_cm ~ Native_Status * hii + (1 | PlotID)
            npar   AIC   BIC logLik -2*log(L)  Chisq Df Pr(>Chisq)
reduced_mod    5 66676 66719 -33333     66666
mixed_mod      6 66660 66711 -33324     66648 18.033  1  2.171e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1