Install the mosaic package, if necessary.
install.packages("mosaic")
Enter the data in Table 12 from Section 12.2. Use the do() command. Notice the row variable is Pain and the column variable is Group. Name the data Table12_df.
Then use the tally command to convert Table12_df into a contingency table named Table12.
library(mosaic)
Table12_df <- rbind(
do(51)*data.frame(Pain="Yes",Group="Zocor"),
do(5)*data.frame(Pain="Yes",Group="Placebo"),
do(16)*data.frame(Pain="Yes",Group="Cholestyramine"),
do(1532)*data.frame(Pain="No",Group="Zocor"),
do(152)*data.frame(Pain="No",Group="Placebo"),
do(163)*data.frame(Pain="No",Group="Cholestyramine")
)
Table12 <- tally(~Pain+Group,data=Table12_df)
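When the cell counts are already known, the same contingency table can also be entered directly as a matrix instead of expanding the data to one row per patient. The following is a minimal base R sketch of that alternative; the names counts and Table12_counts are hypothetical and are not used elsewhere in this activity.

```r
# Cell counts copied from the do() calls above.
# Rows are the levels of Pain; columns are the levels of Group.
counts <- matrix(
  c(163, 152, 1532,   # Pain = "No"
     16,   5,   51),  # Pain = "Yes"
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Pain  = c("No", "Yes"),
    Group = c("Cholestyramine", "Placebo", "Zocor")
  )
)

# Convert the matrix to a table object (hypothetical name).
Table12_counts <- as.table(counts)
Table12_counts

# Append row, column, and grand totals to check the data entry.
addmargins(Table12_counts)
```

The grand total should equal the number of rows in Table12_df, which is a quick way to confirm that no count was mistyped.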
Use xchisq.test on the contingency table Table12. Recall that xchisq.test reports the expected counts, each cell's contribution to the \(\chi^2\) test statistic, and the Pearson residuals.
xchisq.test(Table12)
##
## Pearson's Chi-squared test
##
## data: x
## X-squared = 14.707, df = 2, p-value = 0.0006405
##
## 163 152 1532
## ( 172.28) ( 151.11) (1523.61)
## [ 0.5003] [ 0.0052] [ 0.0462]
## <-0.707> < 0.072> < 0.215>
##
## 16 5 51
## ( 6.72) ( 5.89) ( 59.39)
## [12.8339] [ 0.1346] [ 1.1862]
## < 3.582> <-0.367> <-1.089>
##
## key:
## observed
## (expected)
## [contribution to X-squared]
## <Pearson residual>
The test statistic is \(\chi^2_0 = 14.707\) with 2 degrees of freedom, and the P-value is 0.0006.
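To see where these numbers come from, the statistic can be recomputed by hand. Each expected count is (row total)(column total)/(table total), and the test statistic is \(\chi^2_0 = \sum \frac{(O_i - E_i)^2}{E_i}\) with (rows − 1)(columns − 1) degrees of freedom. The sketch below uses only base R and re-enters the counts from the output above; the variable names are illustrative assumptions, not part of the mosaic package.

```r
# Observed counts, laid out as in the xchisq.test output above.
observed <- matrix(
  c(163, 152, 1532,
     16,   5,   51),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Pain  = c("No", "Yes"),
    Group = c("Cholestyramine", "Placebo", "Zocor")
  )
)

n <- sum(observed)

# Expected count for each cell: (row total * column total) / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / n

# Contribution of each cell to the chi-square statistic.
contribution <- (observed - expected)^2 / expected

# Test statistic, degrees of freedom, and right-tail P-value.
chi_sq  <- sum(contribution)
df      <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

round(expected, 2)
round(contribution, 4)
c(chi_sq = chi_sq, df = df, p_value = p_value)
```

The rounded expected counts and contributions should agree with the values in parentheses and square brackets in the xchisq.test output.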
Now, let’s construct a conditional distribution from the Table 12 data frame. Use the tally command. Recall the syntax:
tally(~response_variable|explanatory_variable,margins=FALSE,format="proportion",data=data_frame)
We are treating Group as the explanatory variable (the column variable), so use
Table12_condition <- tally(~Pain|Group,margins=FALSE,format="proportion",data=Table12_df)
Table12_condition
## Group
## Pain Cholestyramine Placebo Zocor
## No 0.91061453 0.96815287 0.96778269
## Yes 0.08938547 0.03184713 0.03221731
The proportion of patients experiencing abdominal pain is higher in the cholestyramine group (0.089) than in the placebo group (0.032) or the Zocor group (0.032).
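The same conditional distribution can be obtained from the counts alone with base R's prop.table, which may help clarify what format="proportion" is doing. This is a sketch under the assumption that the counts are re-entered by hand; counts and cond are hypothetical names. Because Group is the column variable, each column is divided by its own total (margin = 2).

```r
# Observed counts (rows: Pain, columns: Group), copied from Table 12.
counts <- matrix(
  c(163, 152, 1532,
     16,   5,   51),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Pain  = c("No", "Yes"),
    Group = c("Cholestyramine", "Placebo", "Zocor")
  )
)

# margin = 2 conditions on the column variable (Group):
# every entry is divided by its column total.
cond <- prop.table(counts, margin = 2)
round(cond, 4)

# Each column of a conditional distribution sums to 1.
colSums(cond)
```

Using margin = 1 instead would condition on Pain, which answers a different question (the distribution of treatment groups among patients with and without pain).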
Now that we have the conditional distribution, use the barplot command. The syntax is as follows:
barplot(df_name,beside=TRUE)
Note: cex.names scales the font size of the bar labels (values below 1 shrink them). legend = TRUE adds a legend. ylim=c(0,1.2) extends the y-axis so the legend does not overlap the bars. You should experiment with the limits until you are happy with the graph.
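# With beside = TRUE the colors are matched to the rows of the table,
# so supply one color per level of Pain (No, Yes).
barplot(Table12_condition, beside = TRUE, cex.names = .7, legend = TRUE, ylim = c(0,1.2),
        main = "Patients Reporting Abdominal Pain by Treatment",
        xlab = "Group", ylab = "Relative Frequency",
        col = c('#6897bb', '#c06723'))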
Is there an association between political philosophy and whether one texts while at a red light? Open the SullivanStatsSurveyII data file to answer this question.
Survey <- read.csv("https://sullystats.github.io/Statistics6e/Data/SullivanStatsSurveyII.csv")
head(Survey,n=3)
## Response_id Gender Age Education Tax.Rate GenderIncomeInequality
## 1 290408 Female 19 Some College 10 No
## 2 290410 Female 18 Some College 10 Yes
## 3 290412 Female 21 Some College 10 Yes
## MinWageOpinion MinWageAmount Political.Philosophy Text RetirementDollars
## 1 Yes 10.0 Moderate Yes 1200000
## 2 Yes 9.0 Conservative No 350000
## 3 Yes 9.5 Liberal Yes 1000000
## RetirementAge DeathAge
## 1 65 90
## 2 61 105
## 3 60 90
Now, let’s build a contingency table using the variables “Political Philosophy” and “Text”.
ContTable <- tally(~Political.Philosophy+Text,data=Survey)
ContTable
## Text
## Political.Philosophy No Yes
## Conservative 18 19
## Liberal 18 11
## Moderate 39 29
xchisq.test(ContTable)
##
## Pearson's Chi-squared test
##
## data: x
## X-squared = 1.2953, df = 2, p-value = 0.5233
##
## 18 19
## (20.71) (16.29)
## [0.354] [0.450]
## <-0.60> < 0.67>
##
## 18 11
## (16.23) (12.77)
## [0.193] [0.245]
## < 0.44> <-0.49>
##
## 39 29
## (38.06) (29.94)
## [0.023] [0.030]
## < 0.15> <-0.17>
##
## key:
## observed
## (expected)
## [contribution to X-squared]
## <Pearson residual>
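The test statistic is \(\chi^2_0 = 1.295\) and the P-value is 0.5233, so there is not sufficient evidence of an association between political philosophy and texting while at a red light.
To visualize the data, construct the conditional distribution of Text by political philosophy and draw a bar graph, as before.
Survey_condition <- tally(~Text|Political.Philosophy,margins=FALSE,format="proportion",data=Survey)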
barplot(Survey_condition, beside = TRUE, cex.names = .7,legend=TRUE, ylim=c(0,1.2),main="Do You Text While at Red Lights", xlab = "Political Philosophy", ylab = "Relative Frequency", col = c('#6897bb', '#c06723'))
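The heights of the bars can be checked directly from the counts in ContTable. The sketch below re-enters those counts by hand and uses base R's prop.table; survey_counts and survey_cond are hypothetical names. In ContTable the explanatory variable (Political.Philosophy) is the row variable, so the proportions are computed within rows (margin = 1).

```r
# Counts copied from the ContTable output
# (rows: Political.Philosophy, columns: Text).
survey_counts <- matrix(
  c(18, 19,
    18, 11,
    39, 29),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    Political.Philosophy = c("Conservative", "Liberal", "Moderate"),
    Text = c("No", "Yes")
  )
)

# margin = 1 conditions on the row variable (Political.Philosophy):
# every entry is divided by its row total.
survey_cond <- prop.table(survey_counts, margin = 1)
round(survey_cond, 3)

# Transposing gives the layout passed to barplot above:
# Text in the rows, Political.Philosophy in the columns.
t(survey_cond)
```

Roughly 51% of conservatives, 38% of liberals, and 43% of moderates in the sample report texting at red lights, differences that are consistent with the large P-value from the chi-square test.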