Adam Shen
November 16, 2022
Type ? to bring up the presentation menu.
Press o for the presentation overview (and jump to a specific slide).
To export slides to PDF, press e to activate PDF export mode, then print to PDF (note that some features may not function properly when exported to a PDF).
To copy a code block, hover over the top-right corner of the code block and click the clipboard icon.
Today, we will be looking at fitting a Poisson GLM via Newton-Raphson.
init function
Write a function that takes \mathbf{N} and \mathbf{X} to initialize the Newton-Raphson algorithm.
In order to initialize the Newton-Raphson algorithm, we need starting points for \boldsymbol{\beta}. From the Week 8 lecture notes, it was recommended that a good starting point would be to use linear regression and regress \mathbf{Z} := \ln{(\mathbf{N} + 0.1)} against the covariates.
As such, we have:
\mathbf{Z} \,:=\, \ln{(\mathbf{N} + 0.1)}
and
\boldsymbol{\beta}_{0} \,=\, (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Z}
response: the name of the response variable, as a symbol
design_matrix: the name of the design matrix, as a symbol
We calculate \mathbf{Z}, as recommended.
We calculate our regression coefficients and return them, as required.
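Putting the two steps together, a minimal sketch of init might look like the following. Resolving the symbol arguments with eval() in the calling environment is an assumption, not something stated in the original.

```r
# Sketch of init. The eval() lookup of the symbol arguments is an assumption.
init <- function(response, design_matrix) {
  N <- eval(response)       # resolve the response symbol in the caller's frame
  X <- eval(design_matrix)  # resolve the design matrix symbol
  Z <- log(N + 0.1)         # Z := ln(N + 0.1), as recommended in the notes
  solve(t(X) %*% X) %*% t(X) %*% Z  # beta_0 = (X'X)^{-1} X'Z
}
```

It would be called as, e.g., init(quote(N), quote(X)).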
gradH function
Write a function that takes \mathbf{N}, \mathbf{X}, and \boldsymbol{\beta}, and computes and returns the gradient and the Hessian.
From the Week 8 lecture notes, it was shown that the gradient was given by:
g(\boldsymbol{\beta}) \,=\, \mathbf{X}'(\mathbf{N} - \boldsymbol{\mu}),
where
\boldsymbol{\mu} \,=\, \exp{\left(\mathbf{X}\boldsymbol{\beta}\right)}
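For context, this follows from differentiating the Poisson log-likelihood under the canonical log link:

\ell(\boldsymbol{\beta}) \,=\, \sum_{i=1}^{n}\left( N_{i}\,\mathbf{x}_{i}'\boldsymbol{\beta} \,-\, e^{\mathbf{x}_{i}'\boldsymbol{\beta}} \,-\, \ln N_{i}! \right)
\quad\Longrightarrow\quad
g(\boldsymbol{\beta}) \,=\, \sum_{i=1}^{n}\left( N_{i} - \mu_{i} \right)\mathbf{x}_{i} \,=\, \mathbf{X}'(\mathbf{N} - \boldsymbol{\mu})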
From the end of the Week 7 lecture notes and the Week 8 lecture notes, it was shown that the Hessian was given by:
H(\boldsymbol{\beta}) \,=\, -\mathbf{X}'\mathbf{VX},
where
\mathbf{V} \,=\, \text{diag}(\boldsymbol{\mu}) \,=\, \text{diag}(\exp{(\mathbf{X}\boldsymbol{\beta})})
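Differentiating the gradient once more recovers this expression:

H(\boldsymbol{\beta}) \,=\, \frac{\partial g(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}'} \,=\, -\sum_{i=1}^{n}\mu_{i}\,\mathbf{x}_{i}\mathbf{x}_{i}' \,=\, -\mathbf{X}'\mathbf{VX}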
I'm going to make beta the first argument of gradH so that I can pipe betas into it (optional).
We first initialize \boldsymbol{\mu} = \exp{(\mathbf{X}\boldsymbol{\beta})}.
We then initialize \mathbf{V} by passing \boldsymbol{\mu} (an n\times 1 matrix) to the drop function, which reduces the one-dimensional matrix to an ordinary vector. Finally, we pass this result to the diag function, which creates a diagonal matrix using the supplied values as the diagonal elements.
Finally, we return the gradient and Hessian, as required.
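A minimal sketch of gradH consistent with the steps above, again under the assumption that the symbol arguments are resolved with eval() in the calling environment:

```r
# Sketch of gradH; beta comes first so betas can be piped into it.
gradH <- function(beta, response, design_matrix) {
  N <- eval(response)       # resolve the response symbol (assumption)
  X <- eval(design_matrix)  # resolve the design matrix symbol (assumption)
  mu <- exp(X %*% beta)     # mu = exp(X beta), an n x 1 matrix
  V <- diag(drop(mu))       # drop() reduces mu to a vector before diag()
  list(
    gradient = t(X) %*% (N - mu),  # g(beta) = X'(N - mu)
    hessian  = -t(X) %*% V %*% X   # H(beta) = -X'VX
  )
}
```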
PoisGLM function
Using the results of init and gradH, write a function called PoisGLM that takes \mathbf{N} and \mathbf{X} and returns the MLEs for \boldsymbol{\beta} and the Fisher score.
From the Week 8 notes, to find the MLEs:
1. Initialize \boldsymbol{\beta}_{0}.
2. Perform \boldsymbol{\beta}_{k} \,=\, \boldsymbol{\beta}_{k-1} \,-\, H^{-1}(\boldsymbol{\beta}_{k-1})\cdot g(\boldsymbol{\beta}_{k-1}).
3. Repeat until \lVert g(\boldsymbol{\beta}_{k-1})\rVert_{2} < \varepsilon.
Before returning our final results, verify that H(\boldsymbol{\beta}_{k}) is negative semi-definite, i.e. ensure that we have arrived at a local maximum. We can check this by checking that all of the Hessian's eigenvalues are \leq 0. (Recall in the univariate case, we checked that the second derivative was < 0 which indicated it was concave-down).
As required, PoisGLM will take in the response matrix and the design matrix (similar to our previous functions), as well as a tolerance value (set to 1e-8 by default).
Initialize the starting points. I will overwrite this variable as I progress through the loop.
Use a repeat loop since we do not know in advance how many times we will need to iterate; it would be too messy to check all of our stopping conditions inside a while (cond).
Compute the gradient and Hessian using the current value of beta. We store this in a variable since we need to access it twice, to extract both the gradient and the Hessian matrices.
Check the stopping condition before computing the next iteration of betas. There is a built-in norm function that we can use to compute the Euclidean norm of a matrix.
If the exit condition was not met, we proceed with updating the betas.
Once we have broken out of our loop, we update the derivatives one last time using our final estimate of beta. We also need to verify that our Hessian is negative semidefinite when evaluated at our final estimate before returning the betas.
PoisGLM
PoisGLM <- function(response, design_matrix, tol=1e-8) {
  beta <- init(response, design_matrix)
  repeat {
    derivatives <- gradH(beta, response, design_matrix)
    if (norm(derivatives$gradient, type="2") < tol) {
      break
    }
    beta <- beta - solve(derivatives$hessian) %*% derivatives$gradient
  }
  derivatives <- gradH(beta, response, design_matrix)
  if (any(eigen(derivatives$hessian)$values > 0)) {
    stop("Hessian is not negative semidefinite.")
  }
  list(
    beta = beta,
    fisher_score = -derivatives$hessian
  )
}
If the Hessian is negative semi-definite at our final estimate of beta, then we can return the betas and the Fisher score (in this case, equal to the negative Hessian).
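A hypothetical call, assuming init and gradH resolve their symbol arguments via eval() and that N and X are matrices defined in the calling environment (the names are illustrative, not from this document):

```r
# Hypothetical usage sketch; requires init, gradH, and PoisGLM from above.
fit <- PoisGLM(quote(N), quote(X))
fit$beta                 # MLEs for beta
solve(fit$fisher_score)  # estimated covariance matrix of the MLEs
```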
PoisGLM
(Intercept) Trt sBMI
(Intercept) 0.0031667041 -3.072452e-03 -4.053434e-04
Trt -0.0030724524 8.285479e-03 -2.018442e-05
sBMI -0.0004053434 -2.018442e-05 1.830045e-03