Bootstrapping in R Programming

Last Updated : 16 Dec, 2021

Bootstrapping is a technique used in inferential statistics that works by repeatedly drawing random samples, with replacement, from a single dataset. It allows estimating measures such as the mean, median, mode, and confidence intervals of a statistic from those samples.


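The process of bootstrapping in R Programming Language is as follows: choose the number of bootstrap samples, draw each sample from the dataset at random with replacement (usually at the size of the original dataset), compute the statistic of interest on each sample, and summarize the resulting values, for example by their mean or standard deviation.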
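The resampling process can be sketched by hand in base R before turning to a dedicated package. The block below is a minimal illustration, not the article's own method: the choice of mtcars$mpg, the mean as the statistic, the seed, and the names n_boot and boot_means are all assumptions made for the example.

```r
# Manual bootstrap of the mean of mtcars$mpg using only base R.
set.seed(42)       # assumed seed, only so the run is repeatable

x <- mtcars$mpg    # the single observed dataset
n_boot <- 1000     # number of bootstrap samples (assumed value)

# Draw n_boot resamples of the same size as x, with replacement,
# and compute the statistic (here the mean) on each one.
boot_means <- replicate(
  n_boot,
  mean(sample(x, size = length(x), replace = TRUE))
)

mean(x)            # statistic on the original data
mean(boot_means)   # average of the bootstrap estimates
sd(boot_means)     # bootstrap estimate of the standard error
```

Sampling with replacement is what makes each resample differ from the original data. Without replace = TRUE every resample would be a permutation of x and every mean would be identical.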

Methods of Bootstrapping

There are 2 methods of bootstrapping:

Types of Confidence Intervals in Bootstrapping

A Confidence Interval (CI) is a range of values computed from sample data in statistics. It gives an interval that is expected to contain the true value at a stated confidence level (for example, 95%), not with certainty. There are 5 types of confidence intervals in bootstrapping: basic, normal, studentized, percentile, and BCa (bias-corrected and accelerated).

Basic:

\left(2 \widehat{\theta}-\theta_{(1-\alpha / 2)}^{*}, 2 \widehat{\theta}-\theta_{(\alpha / 2)}^{*}\right)

where,
\alpha represents the significance level, so a 95% confidence interval uses \alpha = 0.05
\widehat{\theta} represents the estimate computed on the original data
\theta^{*} represents the bootstrapped coefficients
\theta_{(1-\alpha / 2)}^{*} represents the 1-\alpha / 2 percentile of the bootstrapped coefficients

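Normal:

\begin{array}{c} t_{0}-b \pm Z_{\alpha} \cdot \mathrm{se}^{*} \\ 2 t_{0}-t^{*} \pm Z_{\alpha} \cdot \mathrm{se}^{*} \end{array}

The two forms are equivalent, as the definition of b shows.

where,
t_{0} represents the statistic computed on the original dataset
t^{*} represents the mean of the bootstrap replicates of the statistic
b is the bias of the bootstrap estimate, i.e.,

b=t^{*}-t_{0}

Z_{\alpha} represents the 1-\alpha / 2 quantile of the standard normal distribution
\mathrm{se}^{*} represents the bootstrap standard error, i.e., the standard deviation of the bootstrap replicates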
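As a rough illustration of the normal interval, the pieces t_0, b, and se^* can be computed from hand-made bootstrap replicates in base R. This is a sketch under stated assumptions: the data (mtcars$mpg), the mean as the statistic, the seed, the 2000 replicates, and the variable names are chosen for the example, and \alpha is taken as 0.05 for a 95% interval.

```r
set.seed(1)                       # assumed seed, for repeatability only

x  <- mtcars$mpg
t0 <- mean(x)                     # statistic on the original data

# Bootstrap replicates of the statistic (2000 is an assumed count)
t_star <- replicate(2000, mean(sample(x, replace = TRUE)))

b  <- mean(t_star) - t0           # bootstrap estimate of the bias
se <- sd(t_star)                  # bootstrap standard error
z  <- qnorm(1 - 0.05 / 2)         # standard normal quantile for alpha = 0.05

# Bias-corrected normal interval: t0 - b +/- z * se
normal_ci <- c(lower = t0 - b - z * se, upper = t0 - b + z * se)
normal_ci
```

The quantile comes from qnorm(), that is, from the standard normal distribution. Only the bias and the standard error are taken from the bootstrap replicates.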

Percentile:

\left(\theta_{(\alpha / 2)}^{*}, \theta_{(1-\alpha / 2)}^{*}\right)
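The percentile interval and the basic interval are built from the same two quantiles of the bootstrap replicates, which a short base R sketch can make concrete. The data, statistic, seed, replicate count, and variable names below are assumptions for illustration, with \alpha = 0.05 for a 95% interval. boot.ci() uses its own interpolation between order statistics, so its endpoints may differ slightly from those of quantile().

```r
set.seed(1)                           # assumed seed, for repeatability only

x <- mtcars$mpg
theta_hat  <- mean(x)                 # estimate on the original data
theta_star <- replicate(2000, mean(sample(x, replace = TRUE)))

alpha <- 0.05                         # significance level for a 95% interval

# alpha/2 and 1 - alpha/2 quantiles of the bootstrap replicates
q <- quantile(theta_star, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)

# Percentile interval: the two quantiles themselves
percentile_ci <- c(lower = q[1], upper = q[2])

# Basic interval: the same quantiles reflected around the original estimate
basic_ci <- c(lower = 2 * theta_hat - q[2], upper = 2 * theta_hat - q[1])

percentile_ci
basic_ci
```

Both intervals have the same width. They differ only when the bootstrap distribution is not symmetric around the original estimate.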

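BCa (bias-corrected and accelerated):

\left(\theta_{(\alpha_{1})}^{*}, \theta_{(\alpha_{2})}^{*}\right)

where,
\alpha_{1}=\Phi\left(z_{0}+\frac{z_{0}+z_{\alpha / 2}}{1-a\left(z_{0}+z_{\alpha / 2}\right)}\right)
\alpha_{2}=\Phi\left(z_{0}+\frac{z_{0}+z_{1-\alpha / 2}}{1-a\left(z_{0}+z_{1-\alpha / 2}\right)}\right)
\Phi represents the standard normal cumulative distribution function and z_{p} its p quantile
z_{0} represents the bias correction, obtained by applying \Phi^{-1} to the proportion of bootstrapped coefficients that are smaller than the original estimate
a represents the acceleration constant, typically estimated from jackknife or empirical influence values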

The syntax to perform bootstrapping in R programming is as follows:

Syntax: boot(data, statistic, R)

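Parameters:
data: the dataset, given as a vector, matrix, or data frame
statistic: a function that takes the data and, by default, a vector of indices, and returns the statistic(s) to be bootstrapped
R: the number of bootstrap replicates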

To learn about the other optional arguments of the boot() function, use the command below:

help("boot")

Example:

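```r
# Library required for boot() function
install.packages("boot")

# Load the library
library(boot)

# Creating a function to pass into boot() function.
# i holds the row indices of one bootstrap sample.
bootFunc <- function(data, i) {
  df <- data[i, ]
  c(cor(df[, 2], df[, 3]),  # correlation between columns 2 and 3 (cyl, disp)
    median(df[, 2]),        # median of column 2 (cyl)
    mean(df[, 1]))          # mean of column 1 (mpg)
}

# Run 100 bootstrap replicates on mtcars
b <- boot(mtcars, bootFunc, R = 100)

print(b)

# Show all CI values for the first statistic (the correlation)
boot.ci(b, index = 1)
```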

Output:

```
ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = mtcars, statistic = bootFunc, R = 100)

Bootstrap Statistics :
      original       bias    std. error
t1*  0.9020329 -0.002195625  0.02104139
t2*  6.0000000  0.340000000  0.85540468
t3* 20.0906250 -0.110812500  0.96052824

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 100 bootstrap replicates

CALL :
boot.ci(boot.out = b, index = 1)

Intervals :
Level      Normal              Basic
95%   ( 0.8592,  0.9375 )   ( 0.8612,  0.9507 )

Level     Percentile            BCa
95%   ( 0.8534,  0.9429 )   ( 0.8279,  0.9280 )
Calculations and Intervals on Original Scale
Some basic intervals may be unstable
Some percentile intervals may be unstable
Warning : BCa Intervals used Extreme Quantiles
Some BCa intervals may be unstable
Warning messages:
1: In boot.ci(b, index = 1) :
  bootstrap variances needed for studentized intervals
2: In norm.inter(t, adj.alpha) :
  extreme order statistics used as endpoints
```
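The warnings in this output come from two choices in the example: only 100 replicates were drawn, and boot.ci() was asked for every interval type, including the studentized one, which needs bootstrap variances that bootFunc does not return. A possible variation, sketched below, uses more replicates and requests only the other four types. The seed, the value R = 2000, and the names corFunc, b2, and ci are assumptions for illustration. The variation should reduce the instability warnings, though that is not guaranteed for every seed.

```r
library(boot)    # assumes the boot package is already installed

set.seed(123)    # assumed seed, so the resampling is repeatable

# Same statistic as t1* in the example: correlation of columns 2 and 3
corFunc <- function(data, i) {
  df <- data[i, ]
  cor(df[, 2], df[, 3])
}

# More replicates give more stable interval endpoints
b2 <- boot(mtcars, corFunc, R = 2000)

# Request only the interval types that need no bootstrap variances
ci <- boot.ci(b2, conf = 0.95, type = c("norm", "basic", "perc", "bca"))
ci
```

Leaving "stud" out of the type argument removes the warning about bootstrap variances. To obtain a studentized interval, the statistic function would also have to return a variance estimate for each resample.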