Multiple Proportions
(Test of Independence)

Type of Plot

Stacked bar chart

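library(ggplot2)
# One bar per explanatory level, filled by response category.
# qplot() is deprecated as of ggplot2 3.4.0; ggplot() + geom_bar() is the equivalent.
ggplot(data = dataframe,
       aes(x = explanatory, fill = response)) +
  geom_bar()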

Mosaic plot

mosaicplot(table(dataframe$explanatory, dataframe$response),
           xlab = "Explanatory",
           ylab = "Response",
           main = "Explanatory vs Response")

Parameters: \( P_1, P_2, \ldots, P_k \)
Point Estimates: \( \hat{P}_1, \hat{P}_2, \ldots, \hat{P}_k \)
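
As a minimal sketch of the point estimates, assume each \( P_i \) is the proportion of one response category (here "yes") within group \( i \). The table below is made-up data, and the names `observed` and `p_hat` are hypothetical:

```r
# Hypothetical 3 x 2 table of counts (made-up data for illustration)
observed <- matrix(c(30, 20,
                     20, 30,
                     25, 25),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(group = c("A", "B", "C"),
                                   response = c("yes", "no")))

# Row-wise proportions; the "yes" column holds the sample proportion per group
p_hat <- prop.table(observed, margin = 1)[, "yes"]
p_hat  # A = 0.6, B = 0.4, C = 0.5
```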

Hypothesis Test

Hypotheses:
\( H_0: P_1 = P_2 = \ldots = P_k \)
\( H_a: \) At least one \( P_i \) differs from the others
for \( i \in \{ 1, \ldots, k \} \)

Test Statistic Random Variable (Assuming \(H_0\) is true):
\( X^2 = \sum_{\text{all cells}} \dfrac{\text{(observed - expected)}^2}{\text{expected}} \sim \chi^2 (df = [R - 1][C - 1]) \)


where expected \( = \dfrac{\text{row } i \text{ total} \cdot \text{column } j \text{ total} }{\text{table total}} \),
\( R = \) number of rows, and
\( C = \) number of columns
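
The expected counts and the statistic can be sketched by hand in R. The table is made-up data, and `observed`, `expected`, `x2_obs`, and `df` are hypothetical names chosen for illustration:

```r
# Hypothetical 3 x 2 table of counts (made-up data for illustration)
observed <- matrix(c(30, 20,
                     20, 30,
                     25, 25),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(group = c("A", "B", "C"),
                                   response = c("yes", "no")))

# expected[i, j] = (row i total * column j total) / table total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected
x2_obs <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (R - 1)(C - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

x2_obs  # 4
df      # 2
```

Every expected count is 25 here because all row totals are 50 and both column totals are 75, so each cell is 50 * 75 / 150.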

Observed Test Statistic:
\( x_{obs}^2 \): Replace "observed" with the counts
in the observed table and "expected" with
the corresponding expected counts

\( \mathbf{\textit{P}}\)-value:
\( \mathbb{P}(X^2 \ge x_{obs}^2) \)
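
A short sketch of the upper-tail probability, using `pchisq()` and checking it against `chisq.test()`. The table is made-up data, and the values 4 and 2 are the statistic and degrees of freedom that this hypothetical table produces:

```r
# Hypothetical 3 x 2 table of counts (made-up data for illustration)
observed <- matrix(c(30, 20,
                     20, 30,
                     25, 25),
                   nrow = 3, byrow = TRUE)

x2_obs <- 4  # observed statistic for this table
df <- 2      # (3 - 1) * (2 - 1)

# P(X^2 >= x2_obs) under the chi-square distribution with df degrees of freedom
p_value <- pchisq(x2_obs, df = df, lower.tail = FALSE)
p_value  # about 0.135

# The built-in test reports the same statistic and p-value
fit <- chisq.test(observed)
fit$statistic
fit$p.value
```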

Conditions for Distributional Approximation (To \( \chi^2\)) (Assuming \( H_0 \) is true):

  1. Independent Observations
  2. All expected cell counts \( \ge 5 \)
  3. Degrees of freedom \( \ge 2 \)
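
Conditions 2 and 3 can be checked from the table itself; a minimal sketch on made-up data follows, with `observed` and `fit` as hypothetical names. Condition 1 depends on how the data were collected and cannot be verified from the counts.

```r
# Hypothetical 3 x 2 table of counts (made-up data for illustration)
observed <- matrix(c(30, 20,
                     20, 30,
                     25, 25),
                   nrow = 3, byrow = TRUE)

fit <- chisq.test(observed)

# Condition 2: every expected cell count is at least 5
expected_ok <- all(fit$expected >= 5)

# Condition 3: degrees of freedom are at least 2
df_ok <- unname(fit$parameter) >= 2

expected_ok  # TRUE
df_ok        # TRUE
```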

Confidence Interval does not apply