Option 1: Use Excel or Google Sheets to create a CSV file. The first column should contain the class midpoints, \(x_i\), and the second column should contain the frequencies, \(f_i\). Be sure to give each column a title.
We will use the data in Table 14 of Section 3.3.
Table14 <- read.csv("https://sullystats.github.io/Statistics6e/Data/Chapter3/Table14.csv")
head(Table14,n=4)
##   Midpoint Frequency
## 1     62.5         1
## 2     87.5         0
## 3    112.5         7
## 4    137.5        10
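Option 2: Enter the data directly into R and create a data frame. Note that this data frame is named Table14a and its frequency column is named Freq, so use Table14a$Midpoint and Table14a$Freq in the commands that follow if you choose this option.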
Midpoint <- c(62.5,87.5,112.5,137.5,162.5,187.5,212.5,237.5,262.5,287.5,312.5)
Freq <- c(1,0,7,10,5,4,13,4,5,0,1)
Table14a <- data.frame(Midpoint,Freq)
head(Table14a,n=4)
##   Midpoint Freq
## 1     62.5    1
## 2     87.5    0
## 3    112.5    7
## 4    137.5   10
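For readers who would rather stay in R than open a spreadsheet, the CSV file described in Option 1 can also be written from R. The following is a minimal sketch, not the textbook's method: the file location comes from tempfile(), and the names csv_path and Table14b are hypothetical names chosen for illustration.

```r
# Class midpoints and frequencies from Table 14 (same values as Option 2)
Midpoint  <- c(62.5, 87.5, 112.5, 137.5, 162.5, 187.5, 212.5, 237.5, 262.5, 287.5, 312.5)
Frequency <- c(1, 0, 7, 10, 5, 4, 13, 4, 5, 0, 1)

# Write a two-column CSV with titled columns; row.names = FALSE keeps
# R's row labels out of the file
csv_path <- tempfile(fileext = ".csv")   # hypothetical location
write.csv(data.frame(Midpoint, Frequency), csv_path, row.names = FALSE)

# Read the file back the same way Option 1 reads the hosted file
Table14b <- read.csv(csv_path)
head(Table14b, n = 4)
```

Replacing csv_path with a path of your own choosing keeps the file for later sessions.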
To find the mean and standard deviation from grouped data, we need to install a package called Weighted.Desc.Stat.
install.packages("Weighted.Desc.Stat")
Now, we will load the Weighted.Desc.Stat package with library() and find the mean and standard deviation of the data in Table 14.
library(Weighted.Desc.Stat)
w.mean(Table14$Midpoint,Table14$Frequency) # Find the mean
## [1] 182.5
w.sd(Table14$Midpoint,Table14$Frequency) # Find the standard deviation
## [1] 53.61903
The weighted mean is \(\overline{x}_w\) = $182.50.
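As a cross-check that does not rely on the package, the weighted mean is the sum of each midpoint times its frequency, divided by the total frequency: \(\overline{x}_w = \frac{\sum x_i f_i}{\sum f_i}\). Here is a minimal sketch in base R, re-entering the Table 14 values so that it runs on its own. The names xbar_manual and xbar_base are illustrative, not part of the original example.

```r
# Class midpoints and frequencies from Table 14
Midpoint <- c(62.5, 87.5, 112.5, 137.5, 162.5, 187.5, 212.5, 237.5, 262.5, 287.5, 312.5)
Freq     <- c(1, 0, 7, 10, 5, 4, 13, 4, 5, 0, 1)

# Weighted mean straight from the formula
xbar_manual <- sum(Midpoint * Freq) / sum(Freq)

# Same quantity using base R's weighted.mean()
xbar_base <- weighted.mean(Midpoint, Freq)

xbar_manual
## [1] 182.5
xbar_base
## [1] 182.5
```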
The w.sd function returns a population standard deviation. Because the data in Table 14 are sample data, we need to multiply the weighted standard deviation by \(\sqrt{\frac{n}{n - 1}}\), where \(n = \sum f_i\) is the total frequency. In this problem, \(n = 50\), so we multiply the weighted standard deviation by \(\sqrt{\frac{50}{49}}\).
w.sd(Table14$Midpoint,Table14$Frequency)*sqrt(50/49) # Find the sample standard deviation.
## [1] 54.1634
So, the weighted sample standard deviation is $54.16.
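The same value can be reached without the package by applying the grouped-data formula directly, \(s = \sqrt{\frac{\sum f_i (x_i - \overline{x}_w)^2}{n - 1}}\) with \(n = \sum f_i\). The sketch below is one way to check the result in base R. It re-enters the Table 14 values so that it runs on its own, and the names n, xbar, s, and s_expanded are illustrative.

```r
# Class midpoints and frequencies from Table 14
Midpoint <- c(62.5, 87.5, 112.5, 137.5, 162.5, 187.5, 212.5, 237.5, 262.5, 287.5, 312.5)
Freq     <- c(1, 0, 7, 10, 5, 4, 13, 4, 5, 0, 1)

n    <- sum(Freq)                  # total frequency (50)
xbar <- sum(Midpoint * Freq) / n   # weighted mean

# Sample standard deviation from the grouped-data formula (divides by n - 1)
s <- sqrt(sum(Freq * (Midpoint - xbar)^2) / (n - 1))
round(s, 4)
## [1] 54.1634

# Alternative check: repeat each midpoint by its frequency and use sd(),
# which also divides by n - 1
s_expanded <- sd(rep(Midpoint, Freq))
round(s_expanded, 4)
## [1] 54.1634
```

Computing n as sum(Freq) rather than typing 50 means the same lines work unchanged for any other frequency table.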