Adam Shen
November 16, 2022
Type ? to bring up the presentation menu
Press o for the presentation overview (and jump to a specific slide)
To export slides to PDF, press e to activate PDF export mode, then print to PDF (note that some features may not function properly when exported to a PDF)
To copy a code block, hover over the top-right corner of the code block and click the clipboard icon
Today, we will be looking at fitting a Poisson GLM via Newton-Raphson.
init function

Write a function that takes \mathbf{N} and \mathbf{X} to initialize the Newton-Raphson algorithm.
In order to initialize the Newton-Raphson algorithm, we need starting points for \boldsymbol{\beta}. From the Week 8 lecture notes, it was recommended that a good starting point would be to use linear regression and regress \mathbf{Z} := \ln{(\mathbf{N} + 0.1)} against the covariates.
As such, we have:
\mathbf{Z} \,:=\, \ln{(\mathbf{N} + 0.1)}
and
\boldsymbol{\beta}_{0} \,=\, (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Z}
The arguments of init are:
response: the name of the response variable, as a symbol
design_matrix: the name of the design matrix, as a symbol
We calculate \mathbf{Z}, as recommended.
We calculate our regression coefficients and return them, as required.
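For reference, a minimal sketch of what init could look like, assuming for illustration that response and design_matrix are passed in directly as the matrices \mathbf{N} and \mathbf{X} (the presenter's original code may differ):

init <- function(response, design_matrix) {
  # Z = ln(N + 0.1), the recommended transformed response
  z <- log(response + 0.1)

  # beta_0 = (X'X)^{-1} X'Z, the least squares coefficients of Z on X
  solve(t(design_matrix) %*% design_matrix, t(design_matrix) %*% z)
}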
gradH function

Write a function that takes \mathbf{N}, \mathbf{X}, and \boldsymbol{\beta}, and computes and returns the gradient and the Hessian.
From the Week 8 lecture notes, it was shown that the gradient was given by:
g(\boldsymbol{\beta}) \,=\, \mathbf{X}'(\mathbf{N} - \boldsymbol{\mu}),
where
\boldsymbol{\mu} \,=\, \exp{\left(\mathbf{X}\boldsymbol{\beta}\right)}
From the end of the Week 7 lecture notes and the Week 8 lecture notes, it was shown that the Hessian was given by:
H(\boldsymbol{\beta}) \,=\, -\mathbf{X}'\mathbf{VX},
where
\mathbf{V} \,=\, \text{diag}(\boldsymbol{\mu}) \,=\, \text{diag}(\exp{(\mathbf{X}\boldsymbol{\beta})})
I'm going to make beta the first argument of gradH so that I can pipe betas into it (optional).
We first initialize \boldsymbol{\mu} = \exp{(\mathbf{X}\boldsymbol{\beta})}.
We then initialize \mathbf{V}: since \boldsymbol{\mu} is an n\times 1 matrix, we pass it to the drop function to reduce this one-dimensional matrix to an ordinary vector, and then pass the result to the diag function, which creates a diagonal matrix using the supplied values as the diagonal elements.
Lastly, we return the gradient and Hessian, as required.
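Putting these steps together, a minimal sketch of gradH (assuming, as with the init sketch above, that response and design_matrix are plain matrices):

gradH <- function(beta, response, design_matrix) {
  # mu = exp(X beta), an n x 1 matrix
  mu <- exp(design_matrix %*% beta)

  # V = diag(mu): drop() reduces the n x 1 matrix to an ordinary vector,
  # and diag() builds the diagonal matrix from it
  V <- diag(drop(mu))

  list(
    gradient = t(design_matrix) %*% (response - mu),      # g(beta) = X'(N - mu)
    hessian  = -t(design_matrix) %*% V %*% design_matrix  # H(beta) = -X'VX
  )
}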
PoisGLM function

Using the results of init and gradH, write a function called PoisGLM that takes \mathbf{N} and \mathbf{X} and returns the MLEs for \boldsymbol{\beta} and the Fisher score.
From the Week 8 notes, to find the MLEs:

1. Initialize \boldsymbol{\beta}_{0}.
2. Perform \boldsymbol{\beta}_{k} \,=\, \boldsymbol{\beta}_{k-1} \,-\, H^{-1}(\boldsymbol{\beta}_{k-1})\cdot g(\boldsymbol{\beta}_{k-1}).
3. Repeat until \lVert g(\boldsymbol{\beta}_{k-1})\rVert_{2} < \varepsilon.
4. Before returning our final results, verify that H(\boldsymbol{\beta}_{k}) is negative semi-definite, i.e. ensure that we have arrived at a local maximum. We can check this by verifying that all of the Hessian's eigenvalues are \leq 0. (Recall that in the univariate case, we checked that the second derivative was < 0, which indicated that the function was concave down.)
As required, PoisGLM will take in the response matrix and the design matrix (similar to our previous functions), as well as a tolerance value (set to 1e-8 by default).
Initialize the starting points. I will overwrite this variable as I progress through the loop.
Use a repeat loop, since we do not know in advance how many times we will need to iterate; checking all of our stopping conditions inside a while (cond) would get messy.
Compute the gradient and Hessian using the current value of beta, storing the result in a variable since we need to access it twice: once for the gradient and once for the Hessian.
Check the stopping condition before computing the next iteration of betas. There is a built-in norm function that we can use to compute the Euclidean norm of a matrix.
If the exiting condition was not met, we proceed with updating the betas.
Once we have broken out of our loop, we update the derivatives one last time using our final estimate of beta. We also need to verify that our Hessian is negative semi-definite when evaluated at our final estimate before returning the betas.
If the Hessian is negative semi-definite at our final estimate of beta, then we can return the betas and the Fisher score (in this case, equal to the negative Hessian). Putting it all together:

PoisGLM <- function(response, design_matrix, tol = 1e-8) {
  beta <- init(response, design_matrix)

  repeat {
    # Gradient and Hessian at the current beta, stored so that both
    # components can be extracted below
    derivatives <- gradH(beta, response, design_matrix)

    # Stop once the Euclidean norm of the gradient falls below the tolerance
    if (norm(derivatives$gradient, type = "2") < tol) {
      break
    }

    # Newton-Raphson update: beta_k = beta_{k-1} - H^{-1} g
    beta <- beta - solve(derivatives$hessian) %*% derivatives$gradient
  }

  # Re-evaluate the derivatives at the final estimate and confirm that we
  # are at a local maximum (all eigenvalues of the Hessian <= 0)
  derivatives <- gradH(beta, response, design_matrix)
  if (any(eigen(derivatives$hessian)$values > 0)) {
    stop("Hessian is not negative semi-definite.")
  }

  list(
    beta = beta,
    fisher_score = -derivatives$hessian
  )
}
PoisGLM

             (Intercept)           Trt          sBMI
(Intercept)  0.0031667041 -3.072452e-03 -4.053434e-04
Trt         -0.0030724524  8.285479e-03 -2.018442e-05
sBMI        -0.0004053434 -2.018442e-05  1.830045e-03
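As an illustration of the full workflow, here is a hypothetical end-to-end run on simulated data. The Trt and sBMI output above comes from the presenter's own dataset, which is not shown, so the sample size, coefficients, and covariates below are made up:

set.seed(1)
n <- 200

# Hypothetical design matrix: intercept, binary treatment, scaled covariate
X <- cbind("(Intercept)" = 1, Trt = rbinom(n, 1, 0.5), sBMI = rnorm(n))

# Simulate Poisson counts from known coefficients
beta_true <- c(0.5, -0.3, 0.2)
N <- matrix(rpois(n, lambda = exp(X %*% beta_true)))

results <- PoisGLM(N, X)
results$beta                  # MLEs, which should be close to beta_true
solve(results$fisher_score)   # inverted information: estimated variance-covariance matrix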