12 The Analysis of Variance
One-way layout
Normal Theory; the F test
Our model is that the observations are corrupted by random errors, and that the error in one observation is independent of the errors in the other observations:\[Y_{ij}=\mu+\alpha_{i}+\varepsilon_{ij}\]
\(\mu\) = the overall mean
\(\alpha_i\) = the differential effect of the \(i\)th treatment, normalized so that \(\sum_{i=1}^{I}\alpha_{i}=0\)
\(\varepsilon_{ij}\) = the random error in the \(j\)th observation under the \(i\)th treatment
The errors are assumed to be independent and \(N(0,\sigma^2)\)
The expected response to the \(i\)th treatment is \(E(Y_{ij})=\mu+\alpha_i\)
Thus, if \(\alpha_i=0\) for all \(i=1,...,I\), all treatments have the same expected response, and, in general, \(\alpha_i-\alpha_j\) is the difference between the expected values under treatments \(i\) and \(j\).
The analysis of variance is based on the following identity:\[\sum_{i=1}^{I}\sum_{j=1}^{J}(Y_{ij}-\overline{Y}_{..})^{2}=\sum_{i=1}^{I}\sum_{j=1}^{J}(Y_{ij}-\overline{Y}_{i.})^{2}+J\sum_{i=1}^{I}(\overline{Y}_{i.}-\overline{Y}_{..})^{2}\]
\[\overline{Y}_{i.}=\frac{1}{J}\sum_{j=1}^{J}Y_{ij}\]
\[\overline{Y}_{..}=\frac{1}{IJ}\sum_{i=1}^{I}\sum_{j=1}^{J}Y_{ij}\]
\(SS_W\) can be used to estimate \(\sigma^2\); the estimate is then\[s_{p}^{2}=\frac{SS_{W}}{I(J-1)}\]
\(SS_B\) can also be used to estimate \(\sigma^2\) in the case where all \(\alpha_i=0\), i.e., under the null hypothesis. This gives the estimate:\[\frac{SS_{B}}{I-1}\]
In that case, the two estimates should therefore be roughly equal,\[\frac{SS_{W}}{I(J-1)}\approx\frac{SS_{B}}{I-1}\]
Since we know the null distribution of the ratio \(SS_B/SS_W\), it makes a useful test statistic!
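To make this concrete, here is a minimal R sketch of the one-way F test on simulated data (the group means 4, 5, 6, the sizes, and the seed are invented for illustration); the hand computation is checked against R's built-in aov:
```r
set.seed(1)
I <- 3; J <- 10                          # I treatments, J observations each
g <- factor(rep(1:I, each = J))
y <- rnorm(I * J, mean = c(4, 5, 6)[g])  # made-up treatment means, sigma = 1

ybar  <- mean(y)
ybari <- tapply(y, g, mean)              # treatment means Ybar_i.
SSW <- sum((y - ybari[g])^2)             # within-groups sum of squares
SSB <- J * sum((ybari - ybar)^2)         # between-groups sum of squares
Fstat <- (SSB / (I - 1)) / (SSW / (I * (J - 1)))
pf(Fstat, I - 1, I * (J - 1), lower.tail = FALSE)  # p-value

summary(aov(y ~ g))                      # built-in one-way ANOVA agrees
```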
An experimental design in which independent measures are made under each of several treatments (i.e., a generalization of the techniques for comparing two independent samples in Chapter 11)
\(I\) independent samples of equal size \(J\)
\(Y_{ij}\) = the \(j\)th observation under the \(i\)th treatment
\(H_0\): all \(I\) treatments have the same effect
\(H_1\): there are systematic differences
What about when the numbers of observations under the various treatments are not necessarily equal?
The Problem of Multiple Comparisons
One approach would be to compare all pairs with the t test, but the problem is that the collection of comparisons would have a much higher type I error rate than each individual test's level \(\alpha\)
Two solutions
Tukey's method
The idea is that all the differences are less than some number if and only if the largest difference is -> a set of confidence intervals that hold simultaneously for all differences \(\mu_{u}-\mu_{v}\):
The studentized range distribution \(SR(k, df)\) has two parameters: the number of samples \(k\), and the number of degrees of freedom used in the variance estimate \(s_p^2\)
If \(I\) independent samples \((Y_{i1},...,Y_{iJ})\) taken from \(N(\mu_i, \sigma^2)\) have the same size \(J\), then the sample means \(\overline{Y}_{i.}\sim N(\mu_{i},\frac{\sigma^{2}}{J})\) are independent and\[\frac{\sqrt{J}}{s_{p}}\max_{u,v}\left|\overline{Y}_{u.}-\overline{Y}_{v.}-(\mu_{u}-\mu_{v})\right|\sim SR(I,I(J-1))\]
The upper \(100\alpha\) percentage point of the distribution is denoted by \(q_{\,I,\,I(J-1)}(\alpha)\); the simultaneous \(100(1-\alpha)\%\) confidence intervals are then\[(\overline{Y}_{u.}-\overline{Y}_{v.})\pm q_{\,I,\,I(J-1)}(\alpha)\,\frac{s_{p}}{\sqrt{J}}\]
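A short R sketch of these simultaneous intervals (simulated data as in the one-way sketch above; qtukey and TukeyHSD are the standard R studentized-range tools):
```r
set.seed(1)
I <- 3; J <- 10                                  # made-up balanced design
g <- factor(rep(1:I, each = J))
y <- rnorm(I * J, mean = c(4, 5, 6)[g])

fit  <- aov(y ~ g)
sp2  <- sum(residuals(fit)^2) / (I * (J - 1))    # pooled variance s_p^2
qval <- qtukey(0.95, nmeans = I, df = I * (J - 1))
hw   <- qval * sqrt(sp2 / J)                     # common half-width of all intervals

TukeyHSD(fit, conf.level = 0.95)                 # built-in simultaneous intervals
```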
The Bonferroni Method
Think of \(k\) independent replications of a statistical test. The overall result is positive if we get at least one positive result among these \(k\) tests. If each single test is performed at significance level \(\alpha/k\), then under the null hypothesis the number of positive results is \(X\sim\textrm{Bin}(k,\frac{\alpha}{k})\), and by independence \(P(X\geq1\mid H_{0})=1-(1-\frac{\alpha}{k})^{k}\approx\alpha\) for small values of \(\alpha\), so the overall significance level is approximately \(\alpha\).
Warning: the \(\binom{I}{2}\) pairwise ANOVA comparisons are not independent, so this binomial calculation is only a heuristic for them; the Bonferroni guarantee itself rests on the union bound \(P(X\geq1\mid H_{0})\leq k\cdot\frac{\alpha}{k}=\alpha\), which holds without independence.
Simultaneous \(100(1-\alpha)\%\) confidence intervals for the \(\binom{I}{2}\) pairwise differences \((\alpha_u-\alpha_v)\):\[(\overline{Y}_{u.}-\overline{Y}_{v.})\pm t_{I(J-1)}\left(\frac{\alpha}{I(I-1)}\right)\cdot s_{p}\sqrt{\frac{2}{J}}\]
Flexibility of the formula: it works for unequal sample sizes as well, after replacing \(\sqrt{\frac{2}{J}}\) by \(\sqrt{\frac{1}{J_{u}}+\frac{1}{J_{v}}}\)
The idea is very simple. If \(k\) null hypotheses are to be tested, a desired overall type I error rate of at most \(\alpha\) can be guaranteed by testing each null hypothesis at level \(\alpha/k\).
Equivalently, if \(k\) confidence intervals are each formed to have confidence level \(100(1-\alpha/k)\%\), they all hold simultaneously with confidence level at least \(100(1-\alpha)\%\).
The method is simple and versatile and, although crude, gives surprisingly good results if \(k\) is not too large.
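A short R sketch of the Bonferroni intervals and tests (same kind of invented data; the critical value mirrors the interval formula above, and pairwise.t.test is base R):
```r
set.seed(1)
I <- 3; J <- 10                                  # made-up balanced design
g <- factor(rep(1:I, each = J))
y <- rnorm(I * J, mean = c(4, 5, 6)[g])

sp    <- sqrt(sum(residuals(aov(y ~ g))^2) / (I * (J - 1)))  # pooled s_p
tcrit <- qt(1 - 0.05 / (I * (I - 1)), df = I * (J - 1))      # t_{I(J-1)}(alpha/(I(I-1)))
hw    <- tcrit * sp * sqrt(2 / J)                # half-width of each Bonferroni interval

pairwise.t.test(y, g, p.adjust.method = "bonferroni")        # adjusted p-values
```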
A Nonparametric Method - The Kruskal-Wallis Test
\(R_{ij}\) = the rank of \(Y_{ij}\) in the combined sample
\[\overline{R}_{i.}=\frac{1}{J_{i}}\sum_{j=1}^{J_{i}}R_{ij}\]
\[\overline{R}_{..}=\frac{1}{N}\sum_{i=1}^{I}\sum_{j=1}^{J_{i}}R_{ij}=\frac{N+1}{2}\]
As in the one-way normal-theory analysis, let\[SS_{B}=\sum_{i=1}^{I}J_{i}(\overline{R}_{i.}-\overline{R}_{..})^{2}\]be a measure of the dispersion of the \(\overline{R}_{i.}\).
\(SS_B\) may be used to test \(H_0\): the probability distributions generating the observations under the various treatments are identical. The larger \(SS_B\) is, the stronger the evidence against the null hypothesis.
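A small R sketch on invented groups of unequal size; note that the scaling \(K=\frac{12}{N(N+1)}SS_{B}\) used below is the standard Kruskal-Wallis normalization of \(SS_B\), which the notes above leave implicit:
```r
set.seed(5)
y <- c(rnorm(8, 0), rnorm(10, 0.5), rnorm(7, 1))   # made-up group sizes and shifts
g <- rep(1:3, times = c(8, 10, 7))
N <- length(y)

r    <- rank(y)                                    # ranks in the combined sample
rbar <- tapply(r, g, mean)                         # Rbar_i.
Ji   <- tabulate(g)
SSB  <- sum(Ji * (rbar - (N + 1) / 2)^2)
K    <- 12 / (N * (N + 1)) * SSB                   # chi-square scaling of SS_B
pchisq(K, df = 3 - 1, lower.tail = FALSE)          # approximate p-value

kruskal.test(y, g)                                 # built-in equivalent
```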
Two-way layout
An experimental design involving two factors, each at two or more levels
differential effect for a day = mean for that day - overall mean
Additive model
\[\hat{Y}_{ij}=\hat{\mu}+\hat{\alpha}_{i}+\hat{\beta}_{j}\]
\(\hat{Y}_{ij}\) = the predicted/fitted values of \(Y_{ij}\)
\(Y_{ij}\) = the observed values
\(Y_{ij}-\hat{Y}_{ij}\) are the residuals from the additive model
Interactions can be incorporated in the model to make the data fit perfectly: the residuals are\[\hat{\delta}_{ij}=Y_{ij}-\hat{Y}_{ij}=Y_{ij}-\overline{Y}_{i.}-\overline{Y}_{.j}+\overline{Y}_{..}\]
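A tiny R illustration of the additive fit and its residuals (the 2x3 response table is made up):
```r
# Toy two-way table: rows = levels of factor A, columns = levels of factor B
Y <- matrix(c(3, 5, 4,
              6, 9, 7), nrow = 2, byrow = TRUE)
mu    <- mean(Y)                         # overall mean
alpha <- rowMeans(Y) - mu                # row (factor A) effects
beta  <- colMeans(Y) - mu                # column (factor B) effects
Yhat  <- mu + outer(alpha, beta, "+")    # fitted values of the additive model
delta <- Y - Yhat                        # residuals = interaction estimates
```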
Assume \(K\) independent observations are taken from each of the \(I\cdot J\) combinations
e.g., \(I\) drugs, \(J\) genders
e.g., \(I\) menu days, \(J\) stoves (figure)
Normal theory for the Two-way layout
Assume K>1 observations per cell
A design with an equal number of observations per cell is called balanced
\(Y_{ijk}\) = the \(k\)th observation in cell \(ij\)
The statistical model is\[Y_{ijk}=\mu+\alpha_{i}+\beta_{j}+\delta_{ij}+\varepsilon_{ijk}\]
\(\varepsilon_{ijk}\) assumed to be \(N(0,\sigma^2)\) and independent
\[\sum_{i=1}^{I}\alpha_{i}=0\]
\[\sum_{j=1}^{J}\beta_{j}=0\]
\[\sum_{i=1}^{I}\delta_{ij}=\sum_{j=1}^{J}\delta_{ij}=0\]
As expected, the mle are:
\(\hat{\mu}=\overline{Y}_{...}\)
\(\hat{\alpha}_{i}=\overline{Y}_{i..}-\overline{Y}_{...}\)
\(\hat{\beta}_{j}=\overline{Y}_{.j.}-\overline{Y}_{...}\)
\(\hat{\delta}_{ij}=\overline{Y}_{ij.}-\overline{Y}_{i..}-\overline{Y}_{.j.}+\overline{Y}_{...}\)
\[SS_{TOT}=SS_{A}+SS_{B}+SS_{AB}+SS_{E}\]
\[SS_{A}=JK\sum_{i=1}^{I}(\overline{Y}_{i..}-\overline{Y}_{...})^{2}\]
\[SS_{B}=IK\sum_{j=1}^{J}(\overline{Y}_{.j.}-\overline{Y}_{...})^{2}\]
\[SS_{AB}=K\sum_{i=1}^{I}\sum_{j=1}^{J}(\overline{Y}_{ij.}-\overline{Y}_{i..}-\overline{Y}_{.j.}+\overline{Y}_{...})^{2}\]
\[SS_{E}=\sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K}(Y_{ijk}-\overline{Y}_{ij.})^{2}\]
\[SS_{TOT}=\sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K}(Y_{ijk}-\overline{Y}_{...})^{2}\]
F tests for the various null hypotheses are performed exactly as in the one-way layout.
"When such a ratio is substantially larger than 1, the presence of an effect is suggested"
Mean square (MS) = Sum of squares / degrees of freedom
Note, for example, that from the expected sums of squares, \(E(MS_A)=\sigma^2+(JK/(I-1))\sum_i\alpha_i^2\) and that \(E(MS_E)=\sigma^2\).
So if the ratio \(MS_A/MS_E\) is large, it suggests that some of the \(\alpha_i\) are nonzero.
The null distribution of this \(F\) statistic is the \(F\) distribution with \((I-1)\) and \(IJ(K-1)\) degrees of freedom -> can calculate p-value
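A sketch in R of the two-way F tests for a balanced design (the factor levels, effects, and seed are invented for illustration):
```r
set.seed(2)
I <- 3; J <- 2; K <- 4                           # levels of A and B, obs per cell
d <- expand.grid(a = factor(1:I), b = factor(1:J), k = 1:K)
d$y <- rnorm(nrow(d), mean = 10 + (d$a == "1") - 0.5 * (d$b == "2"))

fit <- aov(y ~ a * b, data = d)                  # main effects + interaction
summary(fit)                                     # F tests for A, B, and AB
```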
Randomized Block Design
By comparing fertilizers within blocks, the variability between blocks, which would otherwise contribute "noise" to the results, is controlled
Multisample generalization of a matched-pairs design
As a model for the responses in the randomized block design, we will use\[Y_{ij}=\mu+\alpha_{i}+\beta_{j}+\varepsilon_{ij}\]
\(\alpha_i\) = differential effect of the \(i\)th treatment
\(\beta_{j}\) = differential effect of the \(j\)th block
\(\varepsilon_{ij}\) = independent random errors
This is the model used earlier, but with the additional assumption of no interactions between blocks and treatments. Interest is focused on the \(\alpha_{i}\).
From the expected sums of squares (with \(K=1\)), if there is no interaction,
\(E(MS_{A})=\sigma^{2}+J/(I-1)\sum_{i=1}^{I}\alpha_{i}^{2}\)
\(E(MS_{B})=\sigma^{2}+I/(J-1)\sum_{j=1}^{J}\beta_{j}^{2}\)
\(E(MS_{AB})=\sigma^{2}\)
Thus, \(\sigma^2\) can be estimated from \(MS_{AB}\). Also, since these mean squares are independently distributed, F tests can be performed to test \(H_A\) or \(H_B\).
For example, to test \(H_{A}:\) all \(\alpha_{i}=0\), this statistic can be used:\[F=\frac{MS_{A}}{MS_{AB}}\]
Under \(H_A\), the statistic follows an \(F\) distribution with \(I-1\) and \((I-1)(J-1)\) degrees of freedom
\(H_B\) may be tested similarly but is not usually of interest.
If there is an interaction, \(MS_{AB}\) will tend to overestimate \(\sigma^2\) -> \(F\) smaller than it should be -> the test is conservative (the actual probability of a type I error is smaller than desired)
I treatments, J blocks
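A minimal R sketch of the randomized block analysis (hypothetical data; with one observation per cell, the residual mean square of the additive fit is \(MS_{AB}\)):
```r
set.seed(3)
I <- 4; J <- 5                                   # I treatments, J blocks (made up)
d <- expand.grid(treatment = factor(1:I), block = factor(1:J))
d$y <- rnorm(nrow(d), mean = 20 + as.numeric(d$treatment) + 2 * as.numeric(d$block))

fit <- aov(y ~ treatment + block, data = d)      # additive model, one obs per cell
summary(fit)                                     # F = MS_A / MS_AB tests H_A
```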
A Nonparametric Method - Friedman's Test
\(H_0\): no treatment effects
Test statistic\[Q=\frac{12J}{I(I+1)}\sum_{i=1}^{I}\left(\overline{R}_{i.}-\frac{I+1}{2}\right)^{2}\overset{a}{\sim}\chi_{I-1}^{2}\]
Within each of the \(J\) blocks, the observations are ranked \((R_{1j},\ldots,R_{Ij})\); the rank sum within a block is \(\frac{I(I+1)}{2}\), so that \(\frac{1}{I}(R_{1j}+\ldots+R_{Ij})=\frac{I+1}{2}\) for every block, and hence \(\overline{R}_{..}=\frac{I+1}{2}\)
Useful when \(\varepsilon_{ij}\) are non-normal
Reject \(H_0\) for large \(Q\)
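A short R sketch of Friedman's test on invented data, checked against the built-in friedman.test:
```r
set.seed(4)
I <- 3; J <- 8                                   # I treatments, J blocks (made up)
y <- matrix(rnorm(I * J, mean = rep(c(0, 0.5, 1), J)),
            nrow = J, ncol = I, byrow = TRUE)    # rows = blocks, cols = treatments

R    <- t(apply(y, 1, rank))                     # rank within each block
Rbar <- colMeans(R)                              # Rbar_i.
Q    <- 12 * J / (I * (I + 1)) * sum((Rbar - (I + 1) / 2)^2)
pchisq(Q, df = I - 1, lower.tail = FALSE)        # approximate p-value

friedman.test(y)                                 # built-in equivalent
```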
several samples and several factors at the same time
Contrary to what this phrase seems to imply, we will be primarily concerned with the comparison of the means of the data, not their variances
treatments/levels
where \(\overline{Y}_{i.}\) is the mean for a given treatment and \(\overline{Y}_{..}\) is the mean over all treatments
The identity may be expressed symbolically as\[SS_{TOT}=SS_{W}+SS_{B}\]
In words, this means that the total sum of squares equals the sum of squares within groups plus the sum of squares between groups.
= ANOVA
Lemma: for independent \(X_i\) with \(E(X_i)=\mu_i\), \(\textrm{Var}(X_i)=\sigma^2\), \(\overline{X}=\frac{1}{n}\sum_{i=1}^{n}X_i\), and \(\overline{\mu}=\frac{1}{n}\sum_{i=1}^{n}\mu_i\):\[E((X_{i}-\overline{X})^{2})=(\mu_{i}-\overline{\mu})^{2}+\frac{n-1}{n}\sigma^{2}\]
\[E(SS_{W})=I(J-1)\sigma^{2}\]
(by substitution into the lemma above)
\[E(SS_{B})=J\sum_{i=1}^{I}\alpha_{i}^{2}+(I-1)\sigma^{2}\]
\(SS_{W}/\sigma^{2}\sim\chi^2_{I(J-1)}\), and if all \(\alpha_i=0\), then \(SS_{B}/\sigma^{2}\sim\chi_{I-1}^{2}\) and independent of \(SS_W\)
If \(X_{\gamma_1}^2\) and \(X_{\gamma_2}^2\) are independent \(\chi^2\) variables, then\[\dfrac{X_{\gamma_1}^2/\gamma_1}{X_{\gamma_2}^2/\gamma_2} \sim F_{\,\gamma_{1},\,\gamma_{2}}\]
\[F=\frac{SS_{B}/(I-1)}{SS_{W}/(I(J-1))}\sim F_{\,I-1,\,I(J-1)}\]
(under \(H_0\), assuming the errors are normally distributed)
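A quick Monte Carlo check of this null distribution in R (all simulation settings are arbitrary):
```r
set.seed(6)
I <- 3; J <- 10
Fstat <- replicate(5000, {
  y <- rnorm(I * J)                              # H0: all alpha_i = 0
  g <- factor(rep(1:I, each = J))
  m <- tapply(y, g, mean)
  SSW <- sum((y - m[g])^2)
  SSB <- J * sum((m - mean(y))^2)
  (SSB / (I - 1)) / (SSW / (I * (J - 1)))
})
# Compare the simulated upper tail with the theoretical F distribution
mean(Fstat > qf(0.95, I - 1, I * (J - 1)))       # should be close to 0.05
```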
In R: ptukey(q, nmeans, df) and qtukey(p, nmeans, df)
if an interval does not contain zero -> the corresponding \(H_0\) can be rejected
all such hypothesis tests collectively have level \(\alpha\)
\[s_{p}^{2}=\frac{SS_{W}}{I(J-1)}\]
Advantage over Tukey's method: does not require equal sample sizes
\(\overline{R}_{i.}\) = avg. rank in the \(i\)th group; \(\overline{R}_{..}\) = avg. rank among all groups
\(E(SS_{A})=(I-1)\sigma^{2}+JK\sum_{i=1}^{I}\alpha_{i}^{2}\)
\(E(SS_{B})=(J-1)\sigma^{2}+IK\sum_{j=1}^{J}\beta_{j}^{2}\)
\(E(SS_{AB})=(I-1)(J-1)\sigma^{2}+K\sum_{i=1}^{I}\sum_{j=1}^{J}\delta_{ij}^{2}\)
\(E(SS_{E})=IJ(K-1)\sigma^{2}\)
Under the model assumptions:
a. \(SS_E/\sigma^2\sim\chi_{IJ(K-1)}^{2}\)
b. Under the null hypothesis \(H_{A}:\) all \(\alpha_{i}=0\), \(SS_{A}/\sigma^{2}\sim\chi_{I-1}^{2}\)
c. Under the null hypothesis \(H_{B}:\) all \(\beta_{j}=0\), \(SS_{B}/\sigma^{2}\sim\chi_{J-1}^{2}\)
d. Under the null hypothesis \(H_{AB}:\) all \(\delta_{ij}=0\), \(SS_{AB}/\sigma^{2}\sim\chi_{(I-1)(J-1)}^{2}\)
e. The sums of squares above are independent
[How does a log transform affect the CI? An additive effect becomes multiplicative, etc.]