title: "Determine the Standard Deviation from Raw Data"
output: html_document
Base R comes with a variety of numerical summaries, such as sd() and var(), which compute the sample standard deviation and the sample variance. Because base R does not have a built-in function for the population standard deviation, the sample standard deviation must be multiplied by
\(\sqrt{\frac{n-1}{n}}\)
The syntax for the sd command, where vector_name is a numeric vector, is
sd(vector_name)
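Because the correction factor depends only on the number of observations, it can be wrapped in a small helper. The following is a minimal sketch; the name pop_sd is hypothetical and is not part of base R.

```r
# Hypothetical helper (not part of base R): population standard deviation
# obtained by rescaling the sample standard deviation returned by sd().
pop_sd <- function(x) {
  n <- length(x)             # number of observations
  sqrt((n - 1) / n) * sd(x)  # apply the correction factor
}

# Small check: this data set has mean 5 and population variance 32/8 = 4,
# so its population standard deviation is 2.
check_value <- pop_sd(c(2, 4, 4, 4, 5, 5, 7, 9))
check_value
```

The helper assumes a numeric vector with no missing values; handling NA would need an extra step.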
(a) Enter the data into R using the c(…) command (or load the data into R using read.csv). Then, find the population standard deviation.
Table8 <- c(82, 77, 90, 71, 62, 68, 74, 84, 94, 88)
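If the scores are stored in a file instead, read.csv can load them. The sketch below is only illustrative: a temporary file stands in for the real one, and the column name score and the object names csv_path, scores_df, and Table8_from_csv are assumptions, not something specified by the exercise.

```r
# Assumption: the data live in a CSV file with a single column named "score".
# A temporary file is written here so the sketch runs without external data.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("score", "82", "77", "90", "71", "62", "68", "74", "84", "94", "88"),
  csv_path
)

scores_df <- read.csv(csv_path)      # read.csv returns a data frame
Table8_from_csv <- scores_df$score   # extract the column as a numeric vector
length(Table8_from_csv)              # number of observations read
```

Extracting the column with `$` gives a plain numeric vector, which is what the base R sd() command expects.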
There are 10 observations in the data set. So, multiply the sample standard deviation by
\(\sqrt{\frac{9}{10}}\)
sigma <- sqrt(9/10)*sd(Table8)
sigma
## [1] 9.81835
So, \(\sigma\) = 9.8.
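As a check on the correction factor, the population standard deviation can also be computed directly from its definition, \(\sigma = \sqrt{\frac{1}{n}\sum (x_i - \mu)^2}\). A minimal sketch, with the names scores, mu, and sigma_direct chosen only for illustration:

```r
scores <- c(82, 77, 90, 71, 62, 68, 74, 84, 94, 88)

mu <- mean(scores)                           # population mean
sigma_direct <- sqrt(mean((scores - mu)^2))  # divide by n, not n - 1
sigma_direct

# Compare with the rescaled sample standard deviation used in the text
all.equal(sigma_direct, sqrt(9/10) * sd(scores))
```

Both routes should agree up to floating-point rounding, since dividing the sum of squared deviations by n instead of n - 1 is exactly what the factor \(\sqrt{\frac{n-1}{n}}\) accomplishes.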
(b) and (c) Now, let’s find a simple random sample of size 4 from the data in Table 8. Then, we will find the sample standard deviation of the data.
set.seed(14)
Sample_Table8 <- sample(Table8,4,replace=FALSE)
Sample_Table8
## [1] 94 88 90 71
sd(Sample_Table8)
## [1] 10.14479
So, s = 10.1.
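The value of s can be reproduced from the definition of the sample standard deviation, which divides by \(n-1\) rather than \(n\). A minimal sketch using the four values shown in the output above, with illustrative names drawn, x_bar, and s_manual:

```r
drawn <- c(94, 88, 90, 71)  # the four sampled scores

x_bar <- mean(drawn)        # sample mean
s_manual <- sqrt(sum((drawn - x_bar)^2) / (length(drawn) - 1))
s_manual

# Agrees with the built-in sample standard deviation
all.equal(s_manual, sd(drawn))
```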
If necessary, install the Mosaic package.
install.packages("mosaic")
Mosaic provides formula-aware versions of numerical summaries such as sd() and var(), which compute the sample standard deviation and the sample variance. Because Mosaic does not have a built-in function for the population standard deviation, the sample standard deviation must be multiplied by
\(\sqrt{\frac{n-1}{n}}\)
The syntax for the sd command is
sd(~var_name, data=df_name)
First, we will manually enter the data into R. The Mosaic formula syntax refers to variables by column name, so we need to include the column name when building the data frame.
Table8 <- data.frame(score=c(82, 77, 90, 71, 62, 68, 74, 84, 94, 88))
Table8
## score
## 1 82
## 2 77
## 3 90
## 4 71
## 5 62
## 6 68
## 7 74
## 8 84
## 9 94
## 10 88
(a) Find the population standard deviation of the data from Table 8 (Example 3).
library(mosaic)
n <- nrow(Table8)
n
## [1] 10
sigma <- sqrt((n-1)/n)*sd(~score,data=Table8)
sigma
## [1] 9.81835
So, \(\sigma\) = 9.8.
(b) and (c) Now, let’s find a simple random sample of size 4 from the data in Table 8. Then, we will find the sample standard deviation of the data.
set.seed(14)
Sample_Table8 <- sample(Table8,4,replace=FALSE)
Sample_Table8
## score orig.id
## 9 94 9
## 10 88 10
## 3 90 3
## 4 71 4
sd(~score,data=Sample_Table8)
## [1] 10.14479
So, s = 10.1.
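For comparison, a row sample of a data frame can also be drawn in base R by sampling row numbers and indexing. This is a sketch of an alternative approach, not a description of what Mosaic does internally. The names idx and Sample_rows are illustrative, and no output is shown because the rows drawn for a given seed can depend on the R version's random number settings.

```r
Table8 <- data.frame(score = c(82, 77, 90, 71, 62, 68, 74, 84, 94, 88))

set.seed(14)
idx <- sample(nrow(Table8), 4, replace = FALSE)  # four distinct row numbers
Sample_rows <- Table8[idx, , drop = FALSE]       # drop = FALSE keeps a data frame
Sample_rows

sd(Sample_rows$score)  # sample standard deviation of the drawn scores
```

Here the original row numbers survive as the row names of Sample_rows, which plays a role similar to the orig.id column in the Mosaic output.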