Stat 4604: Lab 7

Adam Shen

November 23, 2022

Slide navigation

  • Type ? to bring up the presentation menu

  • Press o for the presentation overview (and jump to a specific slide)

  • To export slides to PDF, press e to activate PDF export mode, then print to PDF (note that some features may not function properly when exported to a PDF)

  • To copy a code block, hover over the top-right corner of the code block and click the clipboard icon

Setup

library(tibble)
library(readr)

migraine <- read_csv("./data/Pdata.csv")

# Response as a column matrix
N <- as.matrix(migraine["N"])
colnames(N) <- ""

# Design matrix: intercept column plus the Trt and sBMI covariates
X <- cbind(
  rep(1, nrow(migraine)),
  as.matrix(migraine[c("Trt", "sBMI")])
)
colnames(X)[1] <- "(Intercept)"

Model building via EM algorithm

  • In the last few labs, we considered the model:

N_{i} \,\sim\, \text{Poisson}(\mu_{i}), \quad \mu_{i} \,=\, \exp{\lbrace\beta_{0} + \beta_{1}x_{i1} + \beta_{2}x_{i2}\rbrace}.

  • This time, we will consider the overdispersed model:

N_{i} \,\sim\, \text{Poisson}(b_{i}\mu_{i}), \quad \mu_{i} \,=\, \exp{\lbrace\beta_{0} + \beta_{1}x_{i1} + \beta_{2}x_{i2}\rbrace},

where

  • b_{i} \,\sim\, \text{Gamma}(\tau^{-1},\, \tau)

  • It is assumed that the N_{i}s and b_{i}s are independent (a small simulation sketch of this model follows below)
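
Marginally over b_{i}, N_{i} is overdispersed relative to the Poisson: its variance exceeds its mean. The sketch below simulates from this model; the values of \tau and the \betas are made up for illustration and are not taken from the lab data.

# Illustrative only: simulate from the overdispersed model with assumed parameters
set.seed(1)
tau_sim <- 0.5                                   # assumed dispersion parameter
mu_sim  <- exp(X %*% c(2.5, -0.5, 0.1))          # assumed beta values
b_sim   <- rgamma(nrow(X), shape = 1/tau_sim, scale = tau_sim)
N_sim   <- rpois(nrow(X), lambda = b_sim * drop(mu_sim))
c(mean = mean(N_sim), var = var(N_sim))          # sample variance exceeds the sample mean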

Modify PoisGLM to accept an offset term

Task 1: Modify the PoisGLM function from the previous lab to accept an offset term.

Note that in doing this, we will also need to modify its helper functions, init and gradH.

Modify PoisGLM to accept an offset term

Note that:

\begin{align*} b_{i}\mu_{i} &= b_{i}\exp{\lbrace\beta_{0} + \beta_{1}x_{i1} + \beta_{2}x_{i2}\rbrace}\\ &= \exp{\lbrace\ln{(b_{i})}\rbrace}\exp{\lbrace\ln{\left(\exp{\lbrace\beta_{0} + \beta_{1}x_{i1} + \beta_{2}x_{i2}\rbrace}\right)}\rbrace}\\ &= \exp{\lbrace\ln{\left(b_{i}\right) \,+\, \beta_{0} + \beta_{1}x_{i1} + \beta_{2}x_{i2}}\rbrace} \end{align*}

  • In the last lab, all functions that calculated \mu_{i} used \exp{\lbrace\beta_{0} + \beta_{1}x_{i1} + \beta_{2}x_{i2}\rbrace}

  • Therefore, to modify all functions that calculated \mu_{i} to include an offset, we simply inject a \ln{(b_{i})} term inside the exponential (a quick numerical check of this identity is shown below)
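
Using made-up numbers (illustrative only):

# Illustrative only: b * exp(eta) and exp(log(b) + eta) agree element by element
b_demo   <- c(0.8, 1.2, 1.5)     # made-up offset values
eta_demo <- c(0.1, -0.3, 0.7)    # made-up linear predictors
all.equal(b_demo * exp(eta_demo), exp(log(b_demo) + eta_demo))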

Modify PoisGLM to accept an offset term

We start by modifying gradH.

gradH <- function(beta, response, design_matrix) {
  mu <- exp(design_matrix %*% beta)

  V <- mu |>
    drop() |>
    diag()

  gradient <- t(design_matrix) %*% (response - mu)

  hessian <- -t(design_matrix) %*% V %*% design_matrix

  list(
    gradient = gradient,
    hessian = hessian
  )
}

Modify PoisGLM to accept an offset term

We start by modifying gradH.

gradH <- function(beta, response, design_matrix, b=rep(1, nrow(response))) {
  # Offset term: log(b_i), as a column matrix
  offset <- as.matrix(b) |>
    log()
  
  # Fitted means b_i * mu_i = exp(log(b_i) + x_i' beta)
  mu <- exp(offset + design_matrix %*% beta)

  V <- mu |>
    drop() |>
    diag()

  # Gradient and Hessian of the Poisson log-likelihood at beta
  gradient <- t(design_matrix) %*% (response - mu)

  hessian <- -t(design_matrix) %*% V %*% design_matrix

  list(
    gradient = gradient,
    hessian = hessian
  )
}

Assume that b is passed in as a vector. We will coerce it to a column matrix inside the function. By default, I will make it a vector of ones. As such, if b is not supplied, the results will be identical to those of the previous lab.
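
As a quick check (illustrative only; the coefficient values below are arbitrary), omitting b gives the same result as passing a vector of ones explicitly, since the offset is then \ln{(1)} = 0:

# Illustrative only: the default b reproduces the no-offset behaviour
beta_check <- rep(0, ncol(X))    # arbitrary coefficient values
all.equal(gradH(beta_check, N, X),
          gradH(beta_check, N, X, b = rep(1, nrow(N))))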

Modify PoisGLM to accept an offset term

Next, we need to modify init.

init <- function(response, design_matrix) {
  Z <- log(response + 0.1)

  solve(t(design_matrix) %*% design_matrix) %*% t(design_matrix) %*% Z
}

Modify PoisGLM to accept an offset term

Next, we need to modify init.

init <- function(response, design_matrix, b=rep(1, nrow(response))) {
  b <- as.matrix(b)

  Z <- log((response / b) + 0.1)

  solve(t(design_matrix) %*% design_matrix) %*% t(design_matrix) %*% Z
}

Again, assume b is supplied as a vector, and coerce it to a column matrix inside the function. As before, set the default value of b to be a vector of ones; if nothing is supplied, the results will be identical to those of the previous lab.
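
Note that init is just ordinary least squares of \ln{\lbrace(N_{i}/b_{i}) + 0.1\rbrace} on the design matrix, so it should agree with base R's lm.fit. A quick check with made-up values of b (illustrative only):

# Illustrative only: compare init() against lm.fit() on the transformed response
set.seed(1)
b_check <- runif(nrow(N), 0.5, 1.5)   # made-up b values
all.equal(
  drop(init(N, X, b_check)),
  lm.fit(X, log((drop(N) / b_check) + 0.1))$coefficients
)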

Modify PoisGLM to accept an offset term

Finally, we modify the core function, PoisGLM.

PoisGLM <- function(response, design_matrix, tol=1e-8) {
  beta <- init(response, design_matrix)

  repeat {
    derivatives <- gradH(beta, response, design_matrix)

    if (norm(derivatives$gradient, type="2") < tol) {
      break
    }

    beta <- beta - solve(derivatives$hessian) %*% derivatives$gradient
  }

  derivatives <- gradH(beta, response, design_matrix)

  if (any(eigen(derivatives$hessian)$values > 0)) {
    stop("Hessian is not negative semidefinite.")
  }

  list(
    beta = beta,
    fisher_score = -derivatives$hessian
  )
}

Modify PoisGLM to accept an offset term

Finally, we modify the core function, PoisGLM.

PoisGLM <- function(response, design_matrix, b=rep(1, nrow(response)), tol=1e-8) {
  # Starting values for beta, accounting for the offset b
  beta <- init(response, design_matrix, b)

  # Newton's method iterations
  repeat {
    derivatives <- gradH(beta, response, design_matrix, b)

    if (norm(derivatives$gradient, type="2") < tol) {
      break
    }

    beta <- beta - solve(derivatives$hessian) %*% derivatives$gradient
  }

  # Confirm that the converged point is a maximum
  derivatives <- gradH(beta, response, design_matrix, b)

  if (any(eigen(derivatives$hessian)$values > 0)) {
    stop("Hessian is not negative semidefinite.")
  }

  list(
    beta = beta,
    fisher_score = -derivatives$hessian
  )
}

The main modification is to pass the value of b supplied to PoisGLM on to its helper functions, init and gradH.
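
As an aside (not part of the lab tasks), the fisher_score element returned by PoisGLM is the Fisher information evaluated at the estimates, so inverting it gives approximate variances of the coefficient estimates. A brief usage sketch:

# Illustrative only: approximate standard errors from the information matrix
fit <- PoisGLM(N, X)
sqrt(diag(solve(fit$fisher_score)))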

Test PoisGLM

We reproduce the results from the previous lab by not supplying anything to b.

PoisGLM(N, X)$beta
                      
(Intercept)  2.5449341
Trt         -0.5271167
sBMI         0.1274928

glm(N ~ ., data=migraine, family="poisson") |>
  coef()
(Intercept)         Trt        sBMI 
  2.5449341  -0.5271167   0.1274928 

Test PoisGLM

Now let's try supplying some values for b.

set.seed(50)
test_b <- runif(nrow(migraine))

PoisGLM(N, X, test_b)$beta
                      
(Intercept)  3.4064354
Trt         -0.6470806
sBMI         0.1487946

glm(N ~ ., data=migraine, family="poisson", offset=log(test_b)) |>
  coef()
(Intercept)         Trt        sBMI 
  3.4064354  -0.6470806   0.1487946 

What if the b_{i}s are not known in advance?

Suppose that we do not know the values of b_{i} in advance, but we do know \tau, where

\textbf{E}(b_{i}\,|\,N_{i} = n_{i}) \,=\, \frac{1 + \tau n_{i}}{1 + \tau \mu_{i}}
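
This conditional mean follows from gamma-Poisson conjugacy. A brief derivation (added for reference; as in the code later on, \text{Gamma}(\tau^{-1}, \tau) is taken to be a shape-scale parameterization, so that \textbf{E}(b_{i}) = 1):

p(b_{i}\,|\,N_{i}=n_{i}) \;\propto\; b_{i}^{n_{i}}e^{-b_{i}\mu_{i}} \cdot b_{i}^{\tau^{-1}-1}e^{-b_{i}/\tau} \;=\; b_{i}^{n_{i}+\tau^{-1}-1}e^{-b_{i}(\mu_{i}+\tau^{-1})},

so that b_{i}\,|\,N_{i}=n_{i} \,\sim\, \text{Gamma}\!\left(n_{i}+\tau^{-1},\; (\mu_{i}+\tau^{-1})^{-1}\right) and

\textbf{E}(b_{i}\,|\,N_{i}=n_{i}) \,=\, \frac{n_{i}+\tau^{-1}}{\mu_{i}+\tau^{-1}} \,=\, \frac{1+\tau n_{i}}{1+\tau\mu_{i}}.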

Task 2: Write a function called nbtau that will fit the model using an EM algorithm for a fixed value of \tau.

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  
}

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  b <- rgamma(1, shape=1/tau, scale=tau) |>
    rep(nrow(response))
}

Assuming that \tau is a fixed known value, we can generate our initial estimates of b_{i}. We need to do this first because we need values for b_{i} in order to obtain our initial estimates of \beta.
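
Side note: with this shape-scale parameterization, \text{Gamma}(\tau^{-1}, \tau) has mean 1 and variance \tau, so the repeated initial draw will typically be a value near 1. A quick illustrative check (the seed, \tau, and sample size are arbitrary):

# Illustrative only: Gamma(1/tau, tau) has mean 1 and variance tau
set.seed(123)
draws <- rgamma(10000, shape = 1/2, scale = 2)   # tau = 2
c(mean = mean(draws), var = var(draws))          # approximately 1 and 2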

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  b <- rgamma(1, shape=1/tau, scale=tau) |>
    rep(nrow(response))
  old_beta <- init(response, design_matrix, b)
}

With these values of b_{i}, we can now initialize \beta.

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  b <- rgamma(1, shape=1/tau, scale=tau) |>
    rep(nrow(response))
  old_beta <- init(response, design_matrix, b)
  
  repeat {
    offset <- as.matrix(b) |>
      log()
    mu <- exp(offset + design_matrix %*% old_beta)
  }
}

We compute \mu_{i} at the top of the loop to make the code for the E-Step calculation a bit cleaner.

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  b <- rgamma(1, shape=1/tau, scale=tau) |>
    rep(nrow(response))
  old_beta <- init(response, design_matrix, b)
  
  repeat {
    offset <- as.matrix(b) |>
      log()
    mu <- exp(offset + design_matrix %*% old_beta)
    
    b <- (1 + tau * response) / (1 + tau * mu)
  }
}

E-Step:

b_{i,\,\text{new}} \,=\, \textbf{E}(b_{i}\,|\,N_{i}=n_{i}) \,=\, \frac{1 + \tau n_{i}}{1 + \tau \mu_{i}}

Note that we use * rather than %*% since tau is a scalar.

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  b <- rgamma(1, shape=1/tau, scale=tau) |>
    rep(nrow(response))
  old_beta <- init(response, design_matrix, b)
  
  repeat {
    offset <- as.matrix(b) |>
      log()
    mu <- exp(offset + design_matrix %*% old_beta)
    
    b <- (1 + tau * response) / (1 + tau * mu)
    
    new_beta <- init(response, design_matrix, b)
  }
}

M-Step:

Perform a Poisson log-linear regression to get the new \betas. Note that we can reuse our init function to do this calculation.

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  b <- rgamma(1, shape=1/tau, scale=tau) |>
    rep(nrow(response))
  old_beta <- init(response, design_matrix, b)
  
  repeat {
    offset <- as.matrix(b) |>
      log()
    mu <- exp(offset + design_matrix %*% old_beta)
    
    b <- (1 + tau * response) / (1 + tau * mu)
    
    new_beta <- init(response, design_matrix, b)
    
    if (norm(new_beta - old_beta, type="2") < tol) {
      break
    }
  }
}

We now add an exit condition. We want to exit when

\lVert\beta_{\text{new}} - \beta_{\text{old}}\rVert_{2} \,<\, \varepsilon.
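
For reference, norm with type="2" on a column matrix gives the usual Euclidean norm (a tiny illustration with made-up numbers):

# Illustrative only: the 2-norm of a column matrix is its Euclidean length
v_demo <- matrix(c(3, 4))
norm(v_demo, type = "2")   # sqrt(3^2 + 4^2) = 5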

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  b <- rgamma(1, shape=1/tau, scale=tau) |>
    rep(nrow(response))
  old_beta <- init(response, design_matrix, b)
  
  repeat {
    offset <- as.matrix(b) |>
      log()
    mu <- exp(offset + design_matrix %*% old_beta)
    
    b <- (1 + tau * response) / (1 + tau * mu)
    
    new_beta <- init(response, design_matrix, b)
    
    if (norm(new_beta - old_beta, type="2") < tol) {
      break
    }
    
    old_beta <- new_beta
  }
}

However, if we are not exiting, we overwrite old_beta with the value of new_beta before starting the next iteration.

Create nbtau

nbtau <- function(response, design_matrix, tau, tol=1e-8) {
  # Initial b_i's: a single Gamma(1/tau, tau) draw, repeated for every observation
  b <- rgamma(1, shape=1/tau, scale=tau) |>
    rep(nrow(response))
  old_beta <- init(response, design_matrix, b)
  
  repeat {
    # Fitted means under the current estimates, using log(b_i) as an offset
    offset <- as.matrix(b) |>
      log()
    mu <- exp(offset + design_matrix %*% old_beta)
    
    # E-Step: replace the b_i's with their conditional expectations
    b <- (1 + tau * response) / (1 + tau * mu)
    
    # M-Step: update the betas given the current b_i's
    new_beta <- init(response, design_matrix, b)
    
    # Exit once the betas have stabilized
    if (norm(new_beta - old_beta, type="2") < tol) {
      break
    }
    
    old_beta <- new_beta
  }
  
  # Return the estimates as a named vector
  drop(new_beta)
}

Once we have successfully exited the loop, we return the values of the \betas. I will return them as a vector rather than a matrix, since vectors print a bit nicer.
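
For reference, drop simply removes the length-one dimensions, turning the coefficient matrix into a named vector (a tiny illustration with made-up values):

# Illustrative only: a 3 x 1 matrix with row names becomes a named vector
m_demo <- matrix(1:3, ncol = 1, dimnames = list(c("(Intercept)", "Trt", "sBMI"), NULL))
drop(m_demo)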

Test nbtau

nbtau(N, X, tau=2)
(Intercept)         Trt        sBMI 
  2.2705637  -0.4267696   0.2010467 

nbtau(N, X, tau=3.2)
(Intercept)         Trt        sBMI 
  2.2762682  -0.4278403   0.1981796 

Test nbtau

Although there is some randomness in the initialization of the algorithm, we don't really need to set a seed, because the algorithm should converge to the same result regardless of the starting values. We can verify this by calling nbtau repeatedly.

nbtau(N, X, tau=2)
(Intercept)         Trt        sBMI 
  2.2705637  -0.4267696   0.2010467 
nbtau(N, X, tau=2)
(Intercept)         Trt        sBMI 
  2.2705637  -0.4267696   0.2010467 
nbtau(N, X, tau=2)
(Intercept)         Trt        sBMI 
  2.2705637  -0.4267696   0.2010467 

What if \tau is not a fixed value?

So far, we have considered the cases where:

  • b_{i}s are known

  • b_{i}s are not known, but depend on \tau which is a known fixed value

Now, we consider the case where the b_{i}s are not known and depend on \tau, which is itself unknown rather than fixed.

Task 3: Write a function called nbreg that will fit the model using an EM algorithm where \tau is unknown and not fixed.

Create nbreg

nbreg <- function(response, design_matrix, tol=1e-8) {
  
}
