We are going to work with the data in Table 1 from Section 11.2A.
First, install the mosaic package, if necessary.
install.packages("mosaic")
Read the data from Table 1 into R.
Table1 <- read.csv("https://sullystats.github.io/Statistics6e/Data/Chapter11/Table1.csv")
head(Table1,n=3)
## Dominant Nondominant
## 1 0.177 0.179
## 2 0.210 0.202
## 3 0.186 0.208
First, use the transform command to obtain the differences and store the result in a new data frame. Call the new variable difference.
Table1_diff <- transform(Table1,difference=Dominant-Nondominant)
head(Table1_diff)
## Dominant Nondominant difference
## 1 0.177 0.179 -0.002
## 2 0.210 0.202 0.008
## 3 0.186 0.208 -0.022
## 4 0.189 0.184 0.005
## 5 0.198 0.215 -0.017
## 6 0.194 0.193 0.001
Following along with Example 1, we want to test
\(H_0:\mu_d = 0\)
\(H_1:\mu_d < 0\)
where difference = Dominant - Nondominant
library(mosaic)
mean(~difference,data=Table1_diff)
## [1] -0.01316667
The sample mean difference is about -0.0132. To make the data consistent with the null hypothesis, we must add 0.0132 to each observation so that the mean of the differenced data is 0.
Table1_adj <- transform(Table1_diff,diff_adj = difference + 0.0132)
mean(~diff_adj,data=Table1_adj)
## [1] 3.333333e-05
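The adjusted mean is 3.333333e-05 rather than exactly 0 because 0.0132 is the sample mean rounded to four decimal places. As a minimal sketch of an alternative, one could subtract the unrounded sample mean instead. The snippet uses only the six differences displayed earlier as an illustrative subset, and the names diffs and diffs_centered are assumptions that do not appear in the original.

```r
# Illustrative subset: the six differences shown in the head() output above.
diffs <- c(-0.002, 0.008, -0.022, 0.005, -0.017, 0.001)

# Subtract the unrounded sample mean so the centered data have mean 0
# (up to floating-point error), instead of adding the rounded value 0.0132.
diffs_centered <- diffs - mean(diffs)

mean(diffs_centered)
```

With the full data frame, the same idea could presumably be written as transform(Table1_diff, diff_adj = difference - mean(difference)). Because it changes the shift slightly, it would not reproduce the bootstrap means shown in this section exactly.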
The mean of the adjusted data is very close to 0 (which is the goal). Now, let's find 2000 resamples of this data, find the mean of each resample, and display the first four bootstrap means.
set.seed(43) #Use a seed so everyone gets the same results.
bootstrap <- do(2000)*mean(~diff_adj,data=resample(Table1_adj)) #Find 2000 bootstrap means
head(bootstrap,n=4)
## mean
## 1 -0.002216667
## 2 0.002450000
## 3 -0.003633333
## 4 0.006616667
Now, find the proportion of bootstrap means that are -0.0132 or less, that is, at least as far below 0 as the observed sample mean difference.
prop(~(mean <= -0.0132),data=bootstrap)
## prop_TRUE
## 0.0025
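For readers who want to see what the resampling steps above do without mosaic, here is a rough base R sketch of the same procedure. It is an illustration under stated assumptions, not the method used in this section: it runs on the six differences displayed earlier rather than the full table, it centers with the unrounded mean, and the names diffs, diffs_null, boot_means, and p_value are hypothetical. Its result will not match the 0.0025 reported above.

```r
# Illustrative subset: the six differences shown in the head() output above.
diffs <- c(-0.002, 0.008, -0.022, 0.005, -0.017, 0.001)

obs_mean   <- mean(diffs)        # observed mean difference
diffs_null <- diffs - obs_mean   # shift the data so the null (mean 0) holds

set.seed(43)                     # seed used only for reproducibility

# Draw 2000 resamples with replacement and record the mean of each one.
boot_means <- replicate(2000, mean(sample(diffs_null, replace = TRUE)))

# Estimated left-tailed P-value: the share of bootstrap means that are
# at or below the observed mean difference.
p_value <- mean(boot_means <= obs_mean)
p_value
```

Here replicate() plays the role of do(2000)*, sample(..., replace = TRUE) plays the role of resample(), and taking the mean of a logical vector plays the role of prop().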
The estimated P-value is 0.0025. The P-value using Student's t-distribution is 0.009.
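The section does not show how the t-based P-value was obtained. One plausible way is base R's t.test with paired = TRUE and alternative = "less", sketched below. The snippet runs on only the first six rows displayed earlier, so its P-value is not expected to equal 0.009; the vector names dominant and nondominant are assumptions.

```r
# Illustrative subset: the first six rows shown in the head() output above.
dominant    <- c(0.177, 0.210, 0.186, 0.189, 0.198, 0.194)
nondominant <- c(0.179, 0.202, 0.208, 0.184, 0.215, 0.193)

# Paired, left-tailed t-test of H0: mu_d = 0 versus H1: mu_d < 0,
# where d = dominant - nondominant.
result <- t.test(dominant, nondominant, paired = TRUE, alternative = "less")
result$p.value
```

With the full data, the analogous call would be t.test(Table1$Dominant, Table1$Nondominant, paired = TRUE, alternative = "less"), which should correspond to the t-based P-value quoted above.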