Stat 4604: Lab 2

Adam Shen

October 12, 2022

Slide navigation

  • Type ? to bring up the presentation menu

  • Press o for the presentation overview (and jump to a specific slide)

  • To export slides to PDF, press e to activate PDF export mode, then print to PDF (note that some features may not function properly when exported to a PDF)

  • To copy a code block, hover over the top-right corner of the code block and click the clipboard icon

Function of the day

Consider the cubic function

f(x) \,=\, x^{3} \,-\, 2x^{2} \,-\, 4, \quad x \in \mathbb{R}.

Task: write a function in R to compute f(x).

Write a function for f(x)

f <- function(x) {
  x^3 - 2*x^2 - 4
}

f(0)
[1] -4
f(c(-3, -1, 2, 5))
[1] -49  -7  -4  71

Write a function for f'(x)

f'(x) \,=\, 3x^{2} \,-\, 4x

fprime <- function(x) {
  3*x^2 - 4*x
}

fprime(0)
[1] 0
fprime(c(-3, 1, 2, 5))
[1] 39 -1  4 55

The central finite difference

  • Working out a derivative by hand can be tedious/difficult in some situations

  • Having a numerical approximation is quite useful

  • One approach is to use the central finite difference:

d(f,x,h) \,=\, \frac{f(x+h) \,-\, f(x-h)}{2h},

for h small.

Task: Write a function in R for the central finite difference.

Write a function for the central finite difference

cfd <- function(f, x, h) {
  if (any(h <= 0)) {
    stop("`h` needs to be greater than 0.")
  }
}

I want this function to be vectorized over h since this will be useful later on. As such, we must use any in the condition supplied to if.

Write a function for the central finite difference

cfd <- function(f, x, h) {
  if (any(h <= 0)) {
    stop("`h` needs to be greater than 0.")
  }
  
  (f(x + h) - f(x - h)) / (2 * h)
}

Test cfd

Let us test cfd using:

  • f(x) = x^{3} \,-\, 2x^{2} \,-\, 4
  • x = 2
  • h = 1 \times 10^{-8}

Test cfd

cfd(f=f, x=2, h=1e-8)
[1] 4

Compare this result to the "true" derivative value:

fprime(2)
[1] 4

Looks good!

Using cfd with a range of h values

Task: We wish to see what happens when we compute:

  • d(f,2,h) and
  • the absolute difference, |d(f,2,h) \,-\, f'(2)|

as h increments from 10^{-1} to 10^{-17} in powers of 10^{-1}.

The base-R approach

results_base <- data.frame(h = 10 ^ (-1:-17))

results_base
       h
1  1e-01
2  1e-02
3  1e-03
4  1e-04
5  1e-05
6  1e-06
7  1e-07
8  1e-08
9  1e-09
10 1e-10
11 1e-11
12 1e-12
13 1e-13
14 1e-14
15 1e-15
16 1e-16
17 1e-17

The base-R approach

results_base <- data.frame(h = 10 ^ (-1:-17)) |>
  transform(cfd_values = cfd(f=f, x=2, h=h))

results_base
       h cfd_values
1  1e-01   4.010000
2  1e-02   4.000100
3  1e-03   4.000001
4  1e-04   4.000000
5  1e-05   4.000000
6  1e-06   4.000000
7  1e-07   4.000000
8  1e-08   4.000000
9  1e-09   4.000000
10 1e-10   4.000000
11 1e-11   4.000000
12 1e-12   4.000356
13 1e-13   3.996803
14 1e-14   4.041212
15 1e-15   3.996803
16 1e-16   0.000000
17 1e-17   0.000000

The base-R approach

results_base <- data.frame(h = 10 ^ (-1:-17)) |>
  transform(cfd_values = cfd(f=f, x=2, h=h)) |>
  transform(abs_diff = abs(cfd_values - fprime(2)))

results_base
       h cfd_values     abs_diff
1  1e-01   4.010000 1.000000e-02
2  1e-02   4.000100 1.000000e-04
3  1e-03   4.000001 9.999992e-07
4  1e-04   4.000000 1.000045e-08
5  1e-05   4.000000 1.594316e-10
6  1e-06   4.000000 5.591119e-10
7  1e-07   4.000000 2.335469e-09
8  1e-08   4.000000 6.871880e-08
9  1e-09   4.000000 3.309615e-07
10 1e-10   4.000000 3.309615e-07
11 1e-11   4.000000 3.309615e-07
12 1e-12   4.000356 3.556023e-04
13 1e-13   3.996803 3.197111e-03
14 1e-14   4.041212 4.121181e-02
15 1e-15   3.996803 3.197111e-03
16 1e-16   0.000000 4.000000e+00
17 1e-17   0.000000 4.000000e+00

The tidyverse approach (setup)

If you do not already have the tidyverse on your device, you can install it using

install.packages("tidyverse")

If you only wish to install the required packages for today's demo, you can use

install.packages("tibble")
install.packages("dplyr")

Once the required packages are installed, load tibble and dplyr using

library(tibble)
library(dplyr)

The tidyverse approach

results_tidy <- tibble(h = 10 ^ (-1:-17))

results_tidy
# A tibble: 17 x 1
       h
   <dbl>
 1 1e- 1
 2 1e- 2
 3 1e- 3
 4 1e- 4
 5 1e- 5
 6 1e- 6
 7 1e- 7
 8 1e- 8
 9 1e- 9
10 1e-10
11 1e-11
12 1e-12
13 1e-13
14 1e-14
15 1e-15
16 1e-16
17 1e-17

The tidyverse approach

results_tidy <- tibble(h = 10 ^ (-1:-17)) %>%
  mutate(
    cfd_values = cfd(f=f, x=2, h=h),
    abs_diff = abs(cfd_values - fprime(2))
  )

results_tidy
# A tibble: 17 x 3
       h cfd_values abs_diff
   <dbl>      <dbl>    <dbl>
 1 1e- 1       4.01 1.00e- 2
 2 1e- 2       4.00 1.00e- 4
 3 1e- 3       4.00 1.00e- 6
 4 1e- 4       4.00 1.00e- 8
 5 1e- 5       4.00 1.59e-10
 6 1e- 6       4.00 5.59e-10
 7 1e- 7       4.00 2.34e- 9
 8 1e- 8       4.00 6.87e- 8
 9 1e- 9       4.00 3.31e- 7
10 1e-10       4.00 3.31e- 7
11 1e-11       4.00 3.31e- 7
12 1e-12       4.00 3.56e- 4
13 1e-13       4.00 3.20e- 3
14 1e-14       4.04 4.12e- 2
15 1e-15       4.00 3.20e- 3
16 1e-16       0    4   e+ 0
17 1e-17       0    4   e+ 0

What do we notice?

# A tibble: 2 x 3
      h cfd_values abs_diff
  <dbl>      <dbl>    <dbl>
1 1e-16          0        4
2 1e-17          0        4
  • For h = 10^{-16} and h = 10^{-17}, the value returned by the central finite difference is zero!

  • Why is this happening?

What's going on?!

h1 <- 1e-16

2 + h1
[1] 2
2 - h1
[1] 2

f(2 + h1)
[1] -4
f(2 - h1)
[1] -4
f(2 + h1) - f(2 - h1)
[1] 0
2 * h1
[1] 2e-16

What's going on?

  • The issue is occurring in the numerator of the central finite difference

  • As h gets very small, f(2 + h) and f(2 - h) will be nearly identical, resulting in a difference of zero

  • As such, the cfd function is returning a value of zero

Base-R vs Tidyverse

Did you spot the differences?

  • Tibbles are fancier data frames that have nice printing features
results_base
       h cfd_values     abs_diff
1  1e-01   4.010000 1.000000e-02
2  1e-02   4.000100 1.000000e-04
3  1e-03   4.000001 9.999992e-07
4  1e-04   4.000000 1.000045e-08
5  1e-05   4.000000 1.594316e-10
6  1e-06   4.000000 5.591119e-10
7  1e-07   4.000000 2.335469e-09
8  1e-08   4.000000 6.871880e-08
9  1e-09   4.000000 3.309615e-07
10 1e-10   4.000000 3.309615e-07
11 1e-11   4.000000 3.309615e-07
12 1e-12   4.000356 3.556023e-04
13 1e-13   3.996803 3.197111e-03
14 1e-14   4.041212 4.121181e-02
15 1e-15   3.996803 3.197111e-03
16 1e-16   0.000000 4.000000e+00
17 1e-17   0.000000 4.000000e+00
results_tidy
# A tibble: 17 x 3
       h cfd_values abs_diff
   <dbl>      <dbl>    <dbl>
 1 1e- 1       4.01 1.00e- 2
 2 1e- 2       4.00 1.00e- 4
 3 1e- 3       4.00 1.00e- 6
 4 1e- 4       4.00 1.00e- 8
 5 1e- 5       4.00 1.59e-10
 6 1e- 6       4.00 5.59e-10
 7 1e- 7       4.00 2.34e- 9
 8 1e- 8       4.00 6.87e- 8
 9 1e- 9       4.00 3.31e- 7
10 1e-10       4.00 3.31e- 7
11 1e-11       4.00 3.31e- 7
12 1e-12       4.00 3.56e- 4
13 1e-13       4.00 3.20e- 3
14 1e-14       4.04 4.12e- 2
15 1e-15       4.00 3.20e- 3
16 1e-16       0    4   e+ 0
17 1e-17       0    4   e+ 0

Did you spot the differences?

results_base <- data.frame(h = 10 ^ (-1:-17)) |>
  transform(cfd_values = cfd(f=f, x=2, h=h)) |>
  transform(abs_diff = abs(cfd_values - fprime(2)))
  • transform needed to be called twice because the creation of variables cannot use other variables created in the same call

  • In the above, in order to create abs_diff, it required that cfd_values already existed within the data set

results_tidy <- tibble(h = 10 ^ (-1:-17)) %>%
  mutate(
    cfd_values = cfd(f=f, x=2, h=h),
    abs_diff = abs(cfd_values - fprime(2))
  )
  • mutate creates variables sequentially

  • This means that we can create multiple variables within a single call to mutate and variables can depend on one another

Did you spot the differences?

  • The base-R pipe, |>, was a very recent addition to base-R

  • It was introduced in R 4.1 (2021) with additional functionality added in R 4.2 (2022), possibly still more functionalities to come

  • It does not have as many features as the Tidyverse pipe, %>%, which has been around since 2014

  • The Tidyverse pipe, %>%, becomes available upon loading either the tibble or dplyr packages, but originates from the magrittr package

  • For most basic use cases, it will not matter which pipe you use

  • My opinion: if you have already loaded packages such as tibble or dplyr, you might as well use the Tidyverse pipe...