Multiple Proportions (Test of Independence)


Type of Plot

Stacked bar chart

library(ggplot2)  # provides qplot()
# One bar per explanatory level, filled by response category
qplot(x = explanatory,
      data = dataframe,
      fill = response,
      geom = "bar")

Mosaic plot

mosaicplot(table(dataframe$explanatory, dataframe$response),
           xlab = "Explanatory",
           ylab = "Response",
           main = "Explanatory vs Response")

Example Problem

Parameters: $P_1, P_2, \ldots, P_k$
Point Estimates: $\hat{P}_1, \hat{P}_2, \ldots, \hat{P}_k$
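
As a minimal sketch of how the point estimates might be computed in R, assume the $ P_i $ denote the proportion of one response category in each explanatory group. The table `observed`, its group labels, and its counts are hypothetical and made up for illustration.

```r
# Hypothetical counts (made up for illustration): rows are explanatory
# groups, columns are response categories.
observed <- matrix(c(30, 20,
                     25, 25,
                     15, 35),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(explanatory = c("A", "B", "C"),
                                   response = c("yes", "no")))

# Row-wise proportions: each row sums to 1.
row_props <- prop.table(observed, margin = 1)

# Sample proportion of "yes" in each group (the point estimates).
p_hat <- row_props[, "yes"]
print(p_hat)
```

With real data, `table(dataframe$explanatory, dataframe$response)` could take the place of the hand-entered matrix.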

Hypothesis Test

Hypotheses:
$ H_0: P_1 = P_2 = \ldots = P_k $
$ H_a: $ At least one $ P_i $ is different
for $ i \in \lbrace 1, \ldots, k \rbrace $

Test Statistic Random Variable (Assuming $H_0$ is true):
$ X^2 = \sum_{\text{all cells}} \dfrac{(\text{observed} - \text{expected})^2}{\text{expected}} \sim \chi^2 (df = (R - 1)(C - 1)) $

where the expected count for the cell in row $ i $ and column $ j $ is $ \dfrac{\text{row } i \text{ total} \cdot \text{column } j \text{ total} }{\text{table total}} $,
$ R = $ number of rows, and
$ C = $ number of columns

Observed Test Statistic:
$ x_{obs}^2 $: The value of $ X^2 $ computed from the data.
For each cell, replace "observed" with that cell's count in the
observed table and "expected" with that cell's expected count
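
The calculation described above can be sketched in R as follows. This is an illustration under stated assumptions, not a prescribed method: the table `observed` and its counts are hypothetical, and the variable names are chosen only for this example.

```r
# Hypothetical counts (made up for illustration): rows are explanatory
# groups, columns are response categories.
observed <- matrix(c(30, 20,
                     25, 25,
                     15, 35),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(explanatory = c("A", "B", "C"),
                                   response = c("yes", "no")))

# Expected count for cell (i, j): row i total * column j total / table total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Observed test statistic: sum of (observed - expected)^2 / expected over all cells.
x2_obs <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (R - 1)(C - 1).
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

print(expected)
print(c(x2_obs = x2_obs, df = df))
```

For this made-up table the statistic works out to 9.375 with 2 degrees of freedom, which agrees with the statistic reported by `chisq.test(observed)`.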

$ P $-value:
$ \mathbb{P}(X^2 \ge x_{obs}^2) $
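
One way to evaluate this upper-tail probability in R is with `pchisq()`. The sketch below assumes a hypothetical table `observed` with made-up counts, and takes the statistic and degrees of freedom from `chisq.test()`.

```r
# Hypothetical counts (made up for illustration): rows are explanatory
# groups, columns are response categories.
observed <- matrix(c(30, 20,
                     25, 25,
                     15, 35),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(explanatory = c("A", "B", "C"),
                                   response = c("yes", "no")))

# chisq.test() returns the statistic, the degrees of freedom, and the p-value.
result <- chisq.test(observed)
x2_obs <- unname(result$statistic)
df <- unname(result$parameter)

# Upper-tail probability P(X^2 >= x2_obs) under the chi-square distribution.
p_value <- pchisq(x2_obs, df = df, lower.tail = FALSE)
print(p_value)
```

The value computed this way should match `result$p.value`, since `chisq.test()` uses the same upper-tail calculation when no simulation is requested.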

Conditions for Distributional Approximation (To $ \chi^2$) (Assuming $ H_0 $ is true):

Independent Observations
All expected cell counts $ \ge 5 $
Degrees of freedom $ \ge 2 $
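
The two numeric conditions can be checked in R along these lines. This is a sketch that assumes a hypothetical table `observed` with made-up counts. Independence of observations depends on how the data were collected and cannot be verified from the table itself.

```r
# Hypothetical counts (made up for illustration): rows are explanatory
# groups, columns are response categories.
observed <- matrix(c(30, 20,
                     25, 25,
                     15, 35),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(explanatory = c("A", "B", "C"),
                                   response = c("yes", "no")))

# Expected counts under H_0, as computed by chisq.test().
expected <- chisq.test(observed)$expected

# Condition: all expected cell counts are at least 5.
counts_ok <- all(expected >= 5)

# Condition: degrees of freedom (R - 1)(C - 1) are at least 2.
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
df_ok <- df >= 2

print(c(counts_ok = counts_ok, df_ok = df_ok))
```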

Confidence Interval does not apply

Example Problem 2