title: "Determine the Standard Deviation from Raw Data"
output: html_document

Using Base R

Base R comes with a variety of numerical summaries, such as sd() and var(). The sd() and var() commands compute the sample standard deviation and sample variance, respectively. Because R does not have a built-in function for the population standard deviation, the sample standard deviation must be multiplied by

\(\sqrt{\frac{n-1}{n}}\)
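
To see where this factor comes from, compare the two definitions. The worked step below is a sketch in the usual notation, with \(\bar{x}\) for the mean of the n observations (when the data are the whole population, \(\bar{x}\) equals \(\mu\)). The two formulas differ only in the denominator:

\(s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}, \qquad \sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}\)

Factoring the sample formula out of the population formula gives

\(\sigma = \sqrt{\frac{n-1}{n}} \cdot \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{n-1}{n}} \cdot s\)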

The syntax for the sd command, where vector_name is a numeric vector, is

sd(vector_name)
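
Since the conversion is the same for every data set, it can be wrapped in a small helper. The following is a minimal sketch, not a built-in; the name pop_sd and the toy vector are assumptions made for illustration.

```r
# Hypothetical helper (not part of base R): population standard deviation.
# Rescales the sample standard deviation from sd() by sqrt((n - 1) / n).
pop_sd <- function(x) {
  x <- x[!is.na(x)]   # drop missing values so n counts only observed data
  n <- length(x)
  sqrt((n - 1) / n) * sd(x)
}

# Toy data with mean 5 and squared deviations summing to 32,
# so the population variance is 32 / 8 = 4.
toy <- c(2, 4, 4, 4, 5, 5, 7, 9)
pop_sd(toy)
## [1] 2
```

Dropping missing values inside the helper is a design choice made here; without it, a single NA would make the result NA.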

Examples 3 and 4 Computing a Population Standard Deviation and a Sample Standard Deviation

(a) Enter the data into R using the c(…) command (or load the data into R using read.csv). Then, find the population standard deviation.

Table8 <- c(82, 77, 90, 71, 62, 68, 74, 84, 94, 88)

There are 10 observations in the data set. So, multiply the sample standard deviation by

\(\sqrt{\frac{9}{10}}\)

sigma <- sqrt(9/10)*sd(Table8)
sigma
## [1] 9.81835

So, \(\sigma\) = 9.8.
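
As a cross-check, the same value can be computed straight from the definition of the population standard deviation, without rescaling sd(). This is only a sketch; the names scores, mu, sigma_direct, and sigma_scaled are chosen here for illustration.

```r
# Same ten observations as Table8 above.
scores <- c(82, 77, 90, 71, 62, 68, 74, 84, 94, 88)

# Population variance is the mean of the squared deviations from the mean.
mu <- mean(scores)                        # 79
sigma_direct <- sqrt(mean((scores - mu)^2))

# Should agree with the rescaled sample standard deviation.
sigma_scaled <- sqrt(9/10) * sd(scores)
all.equal(sigma_direct, sigma_scaled)
## [1] TRUE
```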

(b) and (c) Now, let’s find a simple random sample of size 4 from the data in Table 8. Then, we will find the sample standard deviation of the data.

set.seed(14)
Sample_Table8 <- sample(Table8,4,replace=FALSE)
Sample_Table8
## [1] 94 88 90 71
sd(Sample_Table8)
## [1] 10.14479

So, s = 10.1.
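
The sample standard deviation can likewise be verified from its definition, which divides by n - 1 rather than n. The sketch below types in the four values drawn above instead of re-running sample(), because the values drawn for a given seed can differ between R versions; the variable names are illustrative.

```r
# The four values drawn in the sample above.
drawn <- c(94, 88, 90, 71)

n     <- length(drawn)
x_bar <- mean(drawn)                      # 85.75

# Sum of squared deviations divided by n - 1, then the square root.
s_direct <- sqrt(sum((drawn - x_bar)^2) / (n - 1))
s_direct
## [1] 10.14479
```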

Using Mosaic

If necessary, install the Mosaic package.

install.packages("mosaic")

Mosaic comes with a variety of numerical summaries, such as sd() and var(). The sd() and var() commands compute the sample standard deviation and sample variance, respectively. Because Mosaic does not have a built-in function for the population standard deviation, the sample standard deviation must be multiplied by

\(\sqrt{\frac{n-1}{n}}\)

The syntax for the sd command is

sd(~var_name, data = df_name)
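
For comparison, here is a hedged sketch of the formula interface next to the base R call it corresponds to. It assumes the mosaic package is installed and skips the Mosaic call otherwise; the data frame toy_df and its column x are made up for illustration.

```r
# Hypothetical data frame used only for this illustration.
toy_df <- data.frame(x = c(2, 4, 4, 4, 5, 5, 7, 9))

# Base R: pass the column as a numeric vector.
s_base <- stats::sd(toy_df$x)

# Mosaic: name the column in a one-sided formula and supply the data frame.
# Guarded so the sketch still runs when mosaic is not installed.
if (requireNamespace("mosaic", quietly = TRUE)) {
  s_mosaic <- mosaic::sd(~x, data = toy_df)
  print(all.equal(s_base, s_mosaic))
}
```

Both calls compute the sample standard deviation, so the same rescaling is needed to obtain the population value.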

Example 1 Computing the Population Standard Deviation and a Sample Standard Deviation

First, we will manually enter the data in R. The Mosaic package assumes the data has column names, so we need to include the column name when building the data frame.

Table8 <- data.frame(score=c(82, 77, 90, 71, 62, 68, 74, 84, 94, 88))
Table8
##    score
## 1     82
## 2     77
## 3     90
## 4     71
## 5     62
## 6     68
## 7     74
## 8     84
## 9     94
## 10    88

(a) Find the population standard deviation of the data from Table 8 (Example 3).

library(mosaic)
n <- nrow(Table8)
n
## [1] 10
sigma <- sqrt((n-1)/n)*sd(~score,data=Table8)
sigma
## [1] 9.81835

So, \(\sigma\) = 9.8.

(b) and (c) Now, let’s find a simple random sample of size 4 from the data in Table 8. Then, we will find the sample standard deviation of the data.

set.seed(14)
Sample_Table8 <- sample(Table8,4,replace=FALSE)
Sample_Table8
##    score orig.id
## 9     94       9
## 10    88      10
## 3     90       3
## 4     71       4
sd(~score,data=Sample_Table8)
## [1] 10.14479

So, s = 10.1.