class: language-r layout: true --- # Lab 8 — November 29 ```r library(tidyverse) ``` <PRE class="fansi fansi-message"><CODE>## -- <span style='font-weight: bold;'>Attaching packages</span> --------------------------------------- tidyverse 1.3.1 -- </CODE></PRE><PRE class="fansi fansi-message"><CODE>## <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>ggplot2</span> 3.3.6 <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>purrr </span> 0.3.4 ## <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>tibble </span> 3.1.6 <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>dplyr </span> 1.0.9 ## <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>tidyr </span> 1.2.0 <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>stringr</span> 1.4.0 ## <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>readr </span> 2.1.2 <span style='color: #00BB00;'>v</span> <span style='color: #0000BB;'>forcats</span> 0.5.1 </CODE></PRE><PRE class="fansi fansi-message"><CODE>## -- <span style='font-weight: bold;'>Conflicts</span> ------------------------------------------ tidyverse_conflicts() -- ## <span style='color: #BB0000;'>x</span> <span style='color: #0000BB;'>dplyr</span>::<span style='color: #00BB00;'>filter()</span> masks <span style='color: #0000BB;'>stats</span>::filter() ## <span style='color: #BB0000;'>x</span> <span style='color: #0000BB;'>dplyr</span>::<span style='color: #00BB00;'>lag()</span> masks <span style='color: #0000BB;'>stats</span>::lag() </CODE></PRE> ```r library(GGally) library(broom) library(performance) theme_set(theme_bw()) ``` --- # The `state` data set ```r ?state ``` - The `state` data set is divided into seven parts - The part that we are interested in is `state.x77` ```r data(state) ``` --- # The `state` data set ```r head(state.x77) ``` ``` ## Population Income Illiteracy Life Exp Murder HS Grad Frost Area ## Alabama 3615 3624 2.1 69.05 
15.1 41.3 20 50708 ## Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432 ## Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417 ## Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945 ## California 21198 5114 1.1 71.71 10.3 62.6 20 156361 ## Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766 ``` ```r class(state.x77) ``` ``` ## [1] "matrix" "array" ``` - This data is in matrix format; we wish to convert it to a tibble before proceeding further - Note that the state names are the row names of the matrix; they aren't actually in a column --- # Converting the matrix to a tibble - Though not shown in the documentation for [as_tibble()](https://tibble.tidyverse.org/reference/as_tibble.html), you can actually supply the `rownames` argument to the `as_tibble()` method for matrices (documented [here](https://github.com/tidyverse/tibble/issues/288#issuecomment-334244077)) - The value you supply to the `rownames` argument is the name of the column that should contain the row names - Since the row names are the states, let's use `state` as the name of the new column ```r state2 <- state.x77 %>% as_tibble(rownames="state") state2 ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 50 x 9</span> ## state Population Income Illiteracy `Life Exp` Murder `HS Grad` Frost Area ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'> 1</span> Alabama <span style='text-decoration: underline;'>3</span>615 <span 
style='text-decoration: underline;'>3</span>624 2.1 69.0 15.1 41.3 20 <span style='text-decoration: underline;'>50</span>708 ## <span style='color: #555555;'> 2</span> Alaska 365 <span style='text-decoration: underline;'>6</span>315 1.5 69.3 11.3 66.7 152 <span style='text-decoration: underline;'>566</span>432 ## <span style='color: #555555;'> 3</span> Arizona <span style='text-decoration: underline;'>2</span>212 <span style='text-decoration: underline;'>4</span>530 1.8 70.6 7.8 58.1 15 <span style='text-decoration: underline;'>113</span>417 ## <span style='color: #555555;'> 4</span> Arkans~ <span style='text-decoration: underline;'>2</span>110 <span style='text-decoration: underline;'>3</span>378 1.9 70.7 10.1 39.9 65 <span style='text-decoration: underline;'>51</span>945 ## <span style='color: #555555;'> 5</span> Califo~ <span style='text-decoration: underline;'>21</span>198 <span style='text-decoration: underline;'>5</span>114 1.1 71.7 10.3 62.6 20 <span style='text-decoration: underline;'>156</span>361 ## <span style='color: #555555;'> 6</span> Colora~ <span style='text-decoration: underline;'>2</span>541 <span style='text-decoration: underline;'>4</span>884 0.7 72.1 6.8 63.9 166 <span style='text-decoration: underline;'>103</span>766 ## <span style='color: #555555;'> 7</span> Connec~ <span style='text-decoration: underline;'>3</span>100 <span style='text-decoration: underline;'>5</span>348 1.1 72.5 3.1 56 139 <span style='text-decoration: underline;'>4</span>862 ## <span style='color: #555555;'> 8</span> Delawa~ 579 <span style='text-decoration: underline;'>4</span>809 0.9 70.1 6.2 54.6 103 <span style='text-decoration: underline;'>1</span>982 ## <span style='color: #555555;'> 9</span> Florida <span style='text-decoration: underline;'>8</span>277 <span style='text-decoration: underline;'>4</span>815 1.3 70.7 10.7 52.6 11 <span style='text-decoration: underline;'>54</span>090 ## <span style='color: #555555;'>10</span> Georgia <span style='text-decoration: 
underline;'>4</span>931 <span style='text-decoration: underline;'>4</span>091 2 68.5 13.9 40.6 60 <span style='text-decoration: underline;'>58</span>073 ## <span style='color: #555555;'># ... with 40 more rows</span> </CODE></PRE> --- # Extra beautification - Let's also rename some of the columns... ```r state2 <- state2 %>% rename_with(~str_to_lower(str_replace_all(.x, pattern=" ", replacement="_"))) state2 ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 50 x 9</span> ## state population income illiteracy life_exp murder hs_grad frost area ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'> 1</span> Alabama <span style='text-decoration: underline;'>3</span>615 <span style='text-decoration: underline;'>3</span>624 2.1 69.0 15.1 41.3 20 <span style='text-decoration: underline;'>50</span>708 ## <span style='color: #555555;'> 2</span> Alaska 365 <span style='text-decoration: underline;'>6</span>315 1.5 69.3 11.3 66.7 152 <span style='text-decoration: underline;'>566</span>432 ## <span style='color: #555555;'> 3</span> Arizona <span style='text-decoration: underline;'>2</span>212 <span style='text-decoration: underline;'>4</span>530 1.8 70.6 7.8 58.1 15 <span style='text-decoration: underline;'>113</span>417 ## <span style='color: #555555;'> 4</span> Arkansas <span style='text-decoration: underline;'>2</span>110 <span style='text-decoration: underline;'>3</span>378 1.9 70.7 10.1 39.9 65 
<span style='text-decoration: underline;'>51</span>945 ## <span style='color: #555555;'> 5</span> California <span style='text-decoration: underline;'>21</span>198 <span style='text-decoration: underline;'>5</span>114 1.1 71.7 10.3 62.6 20 <span style='text-decoration: underline;'>156</span>361 ## <span style='color: #555555;'> 6</span> Colorado <span style='text-decoration: underline;'>2</span>541 <span style='text-decoration: underline;'>4</span>884 0.7 72.1 6.8 63.9 166 <span style='text-decoration: underline;'>103</span>766 ## <span style='color: #555555;'> 7</span> Connecticut <span style='text-decoration: underline;'>3</span>100 <span style='text-decoration: underline;'>5</span>348 1.1 72.5 3.1 56 139 <span style='text-decoration: underline;'>4</span>862 ## <span style='color: #555555;'> 8</span> Delaware 579 <span style='text-decoration: underline;'>4</span>809 0.9 70.1 6.2 54.6 103 <span style='text-decoration: underline;'>1</span>982 ## <span style='color: #555555;'> 9</span> Florida <span style='text-decoration: underline;'>8</span>277 <span style='text-decoration: underline;'>4</span>815 1.3 70.7 10.7 52.6 11 <span style='text-decoration: underline;'>54</span>090 ## <span style='color: #555555;'>10</span> Georgia <span style='text-decoration: underline;'>4</span>931 <span style='text-decoration: underline;'>4</span>091 2 68.5 13.9 40.6 60 <span style='text-decoration: underline;'>58</span>073 ## <span style='color: #555555;'># ... with 40 more rows</span> </CODE></PRE> --- # Fit a basic OLS model - We wish to fit a basic OLS model with life expectancy as the response and all other variables except `state` as predictors ```r full_model <- lm(life_exp ~ . - state, data=state2) summary(full_model) ``` ``` ## ## Call: ## lm(formula = life_exp ~ . - state, data = state2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -1.48895 -0.51232 -0.02747 0.57002 1.49447 ## ## Coefficients: ## Estimate Std. 
Error t value Pr(>|t|) ## (Intercept) 7.094e+01 1.748e+00 40.586 < 2e-16 *** ## population 5.180e-05 2.919e-05 1.775 0.0832 . ## income -2.180e-05 2.444e-04 -0.089 0.9293 ## illiteracy 3.382e-02 3.663e-01 0.092 0.9269 ## murder -3.011e-01 4.662e-02 -6.459 8.68e-08 *** ## hs_grad 4.893e-02 2.332e-02 2.098 0.0420 * ## frost -5.735e-03 3.143e-03 -1.825 0.0752 . ## area -7.383e-08 1.668e-06 -0.044 0.9649 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.7448 on 42 degrees of freedom ## Multiple R-squared: 0.7362, Adjusted R-squared: 0.6922 ## F-statistic: 16.74 on 7 and 42 DF, p-value: 2.534e-10 ``` --- # Perform backward stepwise selection using AIC - You can use the `step()` function to perform stepwise selection automatically - However, I prefer to alternate between `add1()` and `drop1()` in order to see the effect of adding or removing each predictor and to keep track of what I'm doing at the current step - Starting with the full model, we look at single-term deletions ```r drop1(full_model, ~ .) %>% tidy() %>% slice_min(AIC) ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 1 x 5</span> ## term df sumsq rss AIC ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> area 1 0.001<span style='text-decoration: underline;'>09</span> 23.3 -<span style='color: #BB0000;'>24.2</span> </CODE></PRE> - Remember: lower AIC is better, hence the `slice_min()` - We drop `area` from our model ```r step1 <- update(full_model, . ~ . - area) ``` --- # Perform backward stepwise selection using AIC - Can we add anything back? ```r add1(step1, ~ . 
+ area) %>% tidy() %>% slice_min(AIC) ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 1 x 5</span> ## term df sumsq rss AIC ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> <none> <span style='color: #BB0000;'>NA</span> <span style='color: #BB0000;'>NA</span> 23.3 -<span style='color: #BB0000;'>24.2</span> </CODE></PRE> - The lowest AIC belongs to `<none>`; adding `area` back would not improve the model, so we continue dropping predictors ```r drop1(step1, ~ .) %>% tidy() %>% slice_min(AIC) ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 1 x 5</span> ## term df sumsq rss AIC ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> illiteracy 1 0.003<span style='text-decoration: underline;'>76</span> 23.3 -<span style='color: #BB0000;'>26.2</span> </CODE></PRE> - Let's drop `illiteracy` from our model ```r step2 <- update(step1, . ~ . - illiteracy) ``` --- # Perform backward stepwise selection using AIC - Can we add anything back? ```r add1(step2, ~ . 
+ area + illiteracy) %>% tidy() %>% slice_min(AIC) ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 1 x 5</span> ## term df sumsq rss AIC ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> <none> <span style='color: #BB0000;'>NA</span> <span style='color: #BB0000;'>NA</span> 23.3 -<span style='color: #BB0000;'>26.2</span> </CODE></PRE> - We cannot, so we continue with dropping ```r drop1(step2, ~ .) %>% tidy() %>% slice_min(AIC) ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 1 x 5</span> ## term df sumsq rss AIC ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> income 1 0.006<span style='text-decoration: underline;'>06</span> 23.3 -<span style='color: #BB0000;'>28.2</span> </CODE></PRE> ```r step3 <- update(step2, . ~ . - income) ``` --- # Perform backward stepwise selection using AIC - Can we add anything back? ```r add1(step3, ~ . 
+ area + illiteracy + income) %>% tidy() %>% slice_min(AIC) ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 1 x 5</span> ## term df sumsq rss AIC ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> <none> <span style='color: #BB0000;'>NA</span> <span style='color: #BB0000;'>NA</span> 23.3 -<span style='color: #BB0000;'>28.2</span> </CODE></PRE> - We cannot, so we continue with dropping ```r drop1(step3, ~ .) %>% tidy() %>% slice_min(AIC) ``` <PRE class="fansi fansi-output"><CODE>## <span style='color: #555555;'># A tibble: 1 x 5</span> ## term df sumsq rss AIC ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> <none> <span style='color: #BB0000;'>NA</span> <span style='color: #BB0000;'>NA</span> 23.3 -<span style='color: #BB0000;'>28.2</span> </CODE></PRE> - The model with the lowest AIC is the one that does not drop any more terms; `step3` is the final model --- # Perform backward stepwise selection using AIC Our final model: ```r summary(step3) ``` ``` ## ## Call: ## lm(formula = life_exp ~ population + murder + hs_grad + frost, ## data = state2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -1.47095 -0.53464 -0.03701 0.57621 1.50683 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 7.103e+01 9.529e-01 74.542 < 2e-16 *** ## population 5.014e-05 2.512e-05 1.996 0.05201 . 
## murder -3.001e-01 3.661e-02 -8.199 1.77e-10 *** ## hs_grad 4.658e-02 1.483e-02 3.142 0.00297 ** ## frost -5.943e-03 2.421e-03 -2.455 0.01802 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.7197 on 45 degrees of freedom ## Multiple R-squared: 0.736, Adjusted R-squared: 0.7126 ## F-statistic: 31.37 on 4 and 45 DF, p-value: 1.696e-12 ``` --- # Check the linear model assumptions Recall that we can check the assumptions using ```r par(mfrow = c(2, 2)) plot(step3, which=c(1, 2, 3, 5)) ``` or by using `performance::check_model()` ```r check_model(step3, check=c("vif", "qq", "normality", "linearity", "homogeneity", "outliers")) ``` --- # Check the linear model assumptions <img src="index_files/figure-html/unnamed-chunk-21-1.svg" style="display: block; margin: auto;" /> --- # Check the linear model assumptions - There appear to be some issues with the linearity assumption - The reference line is not very flat: there is a slight dip around `\(x \approx 71\)` - There also seem to be some issues with the constant variance assumption - The reference line in that panel likewise dips near `\(x \approx 70.5\)` --- # Make a scatterplot matrix - Let's make a scatterplot matrix containing the fitted values, the predictors, and the residuals ```r step3 %>% augment() %>% select(.fitted, population, murder, hs_grad, frost, .resid) %>% ggpairs() ``` --- # Make a scatterplot matrix - Let's make a scatterplot matrix containing the fitted values, the predictors, and the residuals <img src="index_files/figure-html/unnamed-chunk-23-1.svg" style="display: block; margin: auto;" /> --- # Make a scatterplot matrix - The residual vs fitted plot (row 6, column 1) and the residual vs murder plot (row 6, column 3) show similar patterns - Therefore, let us regress the absolute value of the residuals on `murder` --- # Linear model of residuals against murder ```r resid_murder_data <- step3 %>% augment() %>% select(.resid, murder) resid_murder_model <- lm(abs(.resid) ~ murder, 
data=resid_murder_data) ``` --- # Perform weighted least squares regression - To address the non-constant error variance in the model obtained via backward stepwise selection, we perform weighted least squares regression - The previous regression of the absolute residuals on the predictor `murder` provides an estimate of the error standard deviation `\(\sigma_{i}\)` for each observation - The weights are the reciprocals of the squared fitted values from this model, `\(w_i = 1 / \hat{\sigma}_i^2\)` ```r w <- 1 / predict(resid_murder_model)^2 head(w) ``` ``` ## 1 2 3 4 5 6 ## 4.328989 3.630323 3.126332 3.444607 3.474556 3.001557 ``` --- # Perform weighted least squares regression - We can fit a weighted least squares model using the `lm()` function and supplying the weights via the `weights` argument ```r weighted_model <- lm(life_exp ~ population + murder + hs_grad + frost, data=state2, weights=w) summary(weighted_model) ``` ``` ## ## Call: ## lm(formula = life_exp ~ population + murder + hs_grad + frost, ## data = state2, weights = w) ## ## Weighted Residuals: ## Min 1Q Median 3Q Max ## -2.32866 -0.93775 -0.08347 1.04909 2.58497 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 7.109e+01 9.254e-01 76.824 < 2e-16 *** ## population 5.337e-05 2.400e-05 2.224 0.0312 * ## murder -3.012e-01 3.594e-02 -8.383 9.62e-11 *** ## hs_grad 4.554e-02 1.451e-02 3.137 0.0030 ** ## frost -6.095e-03 2.406e-03 -2.533 0.0149 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 ## ## Residual standard error: 1.26 on 45 degrees of freedom ## Multiple R-squared: 0.7463, Adjusted R-squared: 0.7237 ## F-statistic: 33.09 on 4 and 45 DF, p-value: 7.067e-13 ``` --- # Comparing the coefficients ```r list( OLS = tidy(step3), WLS = tidy(weighted_model) ) ``` <PRE class="fansi fansi-output"><CODE>## $OLS ## <span style='color: #555555;'># A tibble: 5 x 5</span> ## term estimate std.error statistic p.value ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> (Intercept) 71.0 0.953 74.5 8.61<span style='color: #555555;'>e</span><span style='color: #BB0000;'>-49</span> ## <span style='color: #555555;'>2</span> population 0.000<span style='text-decoration: underline;'>050</span>1 0.000<span style='text-decoration: underline;'>025</span>1 2.00 5.20<span style='color: #555555;'>e</span><span style='color: #BB0000;'>- 2</span> ## <span style='color: #555555;'>3</span> murder -<span style='color: #BB0000;'>0.300</span> 0.036<span style='text-decoration: underline;'>6</span> -<span style='color: #BB0000;'>8.20</span> 1.77<span style='color: #555555;'>e</span><span style='color: #BB0000;'>-10</span> ## <span style='color: #555555;'>4</span> hs_grad 0.046<span style='text-decoration: underline;'>6</span> 0.014<span style='text-decoration: underline;'>8</span> 3.14 2.97<span style='color: #555555;'>e</span><span style='color: #BB0000;'>- 3</span> ## <span style='color: #555555;'>5</span> frost -<span style='color: #BB0000;'>0.005</span><span style='color: #BB0000; text-decoration: underline;'>94</span> 0.002<span style='text-decoration: underline;'>42</span> -<span style='color: #BB0000;'>2.46</span> 1.80<span style='color: #555555;'>e</span><span 
style='color: #BB0000;'>- 2</span> ## ## $WLS ## <span style='color: #555555;'># A tibble: 5 x 5</span> ## term estimate std.error statistic p.value ## <span style='color: #555555; font-style: italic;'><chr></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> <span style='color: #555555; font-style: italic;'><dbl></span> ## <span style='color: #555555;'>1</span> (Intercept) 71.1 0.925 76.8 2.24<span style='color: #555555;'>e</span><span style='color: #BB0000;'>-49</span> ## <span style='color: #555555;'>2</span> population 0.000<span style='text-decoration: underline;'>053</span>4 0.000<span style='text-decoration: underline;'>024</span>0 2.22 3.12<span style='color: #555555;'>e</span><span style='color: #BB0000;'>- 2</span> ## <span style='color: #555555;'>3</span> murder -<span style='color: #BB0000;'>0.301</span> 0.035<span style='text-decoration: underline;'>9</span> -<span style='color: #BB0000;'>8.38</span> 9.62<span style='color: #555555;'>e</span><span style='color: #BB0000;'>-11</span> ## <span style='color: #555555;'>4</span> hs_grad 0.045<span style='text-decoration: underline;'>5</span> 0.014<span style='text-decoration: underline;'>5</span> 3.14 3.00<span style='color: #555555;'>e</span><span style='color: #BB0000;'>- 3</span> ## <span style='color: #555555;'>5</span> frost -<span style='color: #BB0000;'>0.006</span><span style='color: #BB0000; text-decoration: underline;'>10</span> 0.002<span style='text-decoration: underline;'>41</span> -<span style='color: #BB0000;'>2.53</span> 1.49<span style='color: #555555;'>e</span><span style='color: #BB0000;'>- 2</span> </CODE></PRE> --- # Comparing the coefficients Computing the ratio of the coefficients between the two models: ```r coef(step3) / coef(weighted_model) ``` ``` ## (Intercept) population murder hs_grad frost ## 0.9990814 0.9394284 0.9963515 1.0229026 
0.9750663 ``` - Since the ratios of the estimated coefficients do not differ substantially from 1, we do not need to consider iteratively reweighted least squares - From page 426 of the textbook: > *If the estimated coefficients differ substantially from the estimated regression coefficients obtained by ordinary least squares, it is usually advisable to iterate the weighted least squares process by using the residuals from the weighted least squares fit to re-estimate the variance or standard deviation function and then obtain revised weights. Often one or two iterations are sufficient to stabilize the estimated regression coefficients. This iteration process is often called iteratively reweighted least squares.*
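
---

# Sketch: one round of iteratively reweighted least squares

- The procedure quoted above can be sketched directly from the objects built in this lab; the code below is an illustrative sketch only (not needed here, since the OLS and WLS coefficients already agree)

```r
library(tidyverse)

# Rebuild the data and the final OLS model from earlier in the lab
state2 <- state.x77 %>%
  as_tibble(rownames = "state") %>%
  rename_with(~str_to_lower(str_replace_all(.x, pattern = " ", replacement = "_")))

step3 <- lm(life_exp ~ population + murder + hs_grad + frost, data = state2)

# Initial weights: regress |OLS residuals| on murder, then take the
# reciprocal of the squared fitted values
resid_murder_model <- lm(abs(resid(step3)) ~ murder, data = state2)
w <- 1 / predict(resid_murder_model)^2
weighted_model <- lm(life_exp ~ population + murder + hs_grad + frost,
                     data = state2, weights = w)

# One IRLS iteration: re-estimate the standard deviation function from
# the WLS residuals, rebuild the weights, and refit
resid_model2 <- lm(abs(resid(weighted_model)) ~ murder, data = state2)
w2 <- 1 / predict(resid_model2)^2
weighted_model2 <- update(weighted_model, weights = w2)

# Ratios near 1 indicate the coefficients have stabilized, so no
# further iterations would be required
coef(weighted_model) / coef(weighted_model2)
```

- In practice you would repeat the last block until the coefficient ratios stop changing, which the textbook notes usually takes only one or two iterations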