Ch.6 Probability/Proportions
Ch.7 Distribution of Sampling Means
Ch.8 Hypothesis Testing (two-tailed)
Establishes a connection between a sample and the population. Helps to make predictions
Probability of A = number of outcomes classified as A / number of possible outcomes
Ranges from 0 to 1
Requirements
Random Sampling
Independent Random Sampling (if more than one individual is selected)
Chances remain the same
Assures that selection is unbiased
Only true if both possible outcomes have an equal chance of happening
To maintain constant probabilities, must sample WITH replacement
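The definition above and sampling with replacement can be sketched in a few lines; the jar contents are an assumed example, not from the notes:

```python
import random

# Basic probability: p(A) = outcomes classified as A / total possible outcomes
# Hypothetical jar: 4 red and 6 blue marbles
jar = ["red"] * 4 + ["blue"] * 6

p_red = jar.count("red") / len(jar)   # 4 / 10 = 0.4

# Sampling WITH replacement: the jar is unchanged between draws,
# so p(red) stays 0.4 on every draw and the draws stay independent
draws = [random.choice(jar) for _ in range(3)]
```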
Normal Distribution and shape
Y = (1 / (σ√(2π))) e^(-(X - µ)² / (2σ²))
Symmetrical with single mode in the middle
34.13% fall between the mean and z = 1; 13.59% fall between z = 1 and z = 2; and 2.28% fall beyond z = 2
Unit normal table
Lists the relationship between z score locations and proportions in a normal distribution
The same table (in Appendix B) is used for all normal distributions
To find proportions using X, first transform to z score
Used in inferential statistics to identify extreme values that have a low probability of occurring
an entire sample mean can be described with one z score
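The X-to-z transformation and the table lookup can be sketched with `math.erf` standing in for Appendix B; the parameters µ = 100, σ = 15 are an assumed example:

```python
from math import erf, sqrt

def normal_cdf(z):
    # Proportion of a normal distribution below z
    # (the value the unit normal table lists)
    return 0.5 * (1 + erf(z / sqrt(2)))

# Transform a raw score X to a z score, then find the proportion
mu, sigma = 100, 15          # assumed example parameters
X = 130
z = (X - mu) / sigma         # z = +2.0
p_above = 1 - normal_cdf(z)  # ≈ .0228, the 2.28% beyond z = 2
```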
The larger the sample, the lower the standard error
Distribution of Sample Means
collection of means from ALL the possible samples in a population
Also called Sampling Distribution since it is a distribution of means (statistics) and not raw scores
Central Limit Theorem
Specifies the distribution of sample means even for large populations where collecting all possible samples is impractical
for a population with mean µ and standard deviation σ, the distribution of sample means for samples of size n will have a mean of µ and a standard deviation of σ / √n
The expected value of M = the mean of the distribution of sample means; it is exactly equal to the population mean µ
The standard error of M = the standard deviation of the distributions of sample means
If standard error (σM) is small = sample means are close together;
when standard error is large = means are scattered
measures how much distance is expected between M and µ
shows how well a sample represents the population
Standard Error of M: σM = σ / √n
Law of Large Numbers: the larger the sample size n, the closer M is expected to be to µ
Smallest sample, n = 1, has a standard error σM = σ
If given variance (σ²): σ = √σ²
so standard error σM = √(σ² / n)
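The standard error formulas above, sketched as small helpers (the σ = 100 values are illustrative):

```python
from math import sqrt

def standard_error(sigma, n):
    # σM = σ / √n : expected distance between a sample mean M and µ
    return sigma / sqrt(n)

def standard_error_from_variance(variance, n):
    # Starting from σ² instead: σM = √(σ² / n)
    return sqrt(variance / n)

se_1 = standard_error(100, 1)     # 100.0 — smallest sample (n = 1) gives σM = σ
se_16 = standard_error(100, 16)   # 25.0 — standard error shrinks as n grows
```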
Used to find the probability of selecting a sample with a specific mean
Example
what is the probability a sample of n = 16 will have a mean higher than 525 when µ = 500 and σ = 100?
1st- distribution is normal bc population is normal
2nd- distribution has an expected value of M = 500 bc µ = 500
3rd-
for n = 16, standard error σM = 25 bc σM = σ / √n
4th-
locate the position with a z score; here z = (525 - 500) / 25 = +1 (σM = 25 is used in place of σ)
5th-
since it is a normal distribution, use the Unit Normal Table in Appendix B to find the probability beyond z = +1 (= .1587)
conclusion = only 15.87% of samples of n = 16 would have a mean above 525
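The five steps of this example run as a short script, with `math.erf` standing in for the Appendix B table lookup:

```python
from math import erf, sqrt

# Worked example: µ = 500, σ = 100, n = 16; find P(M > 525)
mu, sigma, n = 500, 100, 16

sigma_m = sigma / sqrt(n)             # step 3: standard error = 25.0
z = (525 - mu) / sigma_m              # step 4: z = +1.0 (σM plays the role of σ)
p = 1 - 0.5 * (1 + erf(z / sqrt(2)))  # step 5: proportion above z = +1 ≈ .1587
```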
Sampling Error
Difference between a statistic and the corresponding parameter, because the sample is not a perfect representation of the population
(not to be confused with Standard Error which describes the average distance from the sample mean to the population mean)
Method of defining the sampling error
natural discrepancies like this are present in all inferential statistics and are necessary to account for
If the change occurring in research is within the sampling error, it is not evidence of an effective intervention
This is essentially inferential statistics: evaluates a hypothesis about a population based on sample statistics
Step 1-
State the hypothesis
(and select Alpha level)
Step 2-
Set the criteria
(and locate critical region)
Step 3-
Collect data and compute statistics
(and obtain z score)
Step 4-
Make a decision
(reject or fail to reject H0)
Used to evaluate the results of a research study
Basic research situation
The µ of the population is known before treatment. The purpose is to see if the treatment has an effect on the population µ
Assume the treatment adds or subtracts a constant to every score, so the shape and standard deviation DO NOT CHANGE
The null hypothesis (H0) states that the treatment will produce no change or relationship. The original mean stays the same
The alternative hypothesis (H1) states that there is a change, difference, or relationship from the treatment. The original mean changes
consider which sample mean is consistent with H0
divide the distribution of sample means into two sections: likely for H0 and unlikely for H0
exact values for high and low probability
Alpha level = low probability (.001, .01, and .05 are common)
extremely unlikely values in the tails of the distribution (Alpha) = critical region
If research produces a sample mean in the critical region, H0 is rejected and H1 is accepted
use the Alpha level probability and the unit normal table (ex. α = .05 → .025 in each tail → critical z = ±1.96)
Raw data is used to create the sample mean that will be compared to the original mean
comparison uses z score
z = (M - µ) / σM
(µ is from H0)
(σM = σ / √n)
Determine if the sample mean is in the critical region
if so, H0 is false (rejected) and change has occurred. There is good evidence to support H1
If the sample mean is not in the critical region (close to the original mean), then "Fail to reject H0," which means there is no evidence that the treatment has an effect
Z score = Test Statistic = single specific statistic used to test hypothesis
large value of test statistic = large discrepancy in data (likely in critical region)
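The four steps above can be sketched as a two-tailed z test; all the numbers (µ = 80, σ = 12, n = 36, M = 85) are assumed for illustration:

```python
from math import sqrt

mu0, sigma, n = 80, 12, 36   # step 1: H0 says µ = 80 after treatment; α = .05
z_crit = 1.96                # step 2: critical region is |z| > 1.96 for α = .05

M = 85                       # step 3: sample mean from the collected data
sigma_m = sigma / sqrt(n)    # σM = σ / √n = 2.0
z = (M - mu0) / sigma_m      # test statistic z = (M - µ) / σM = 2.5

reject_h0 = abs(z) > z_crit  # step 4: 2.5 is in the critical region → reject H0
```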
Errors
Type I
Type II
When data leads to rejecting H0, but the treatment actually has no effect (H0 is true even though the sample statistic shows otherwise)
Because samples are not identical to populations
Occurs when a researcher unknowingly obtains an extreme nonrepresentative sample
Alpha level is also the probability that a Type I error will occur
A treatment effect actually exists, but the sample does not detect it. Researcher fails to reject H0
When sample mean is NOT in the critical region even though the treatment truly has an effect
No exact probability this error will occur
Represented by the beta symbol β
Not as serious as a Type I error
Error is always a possibility when working with samples
under control of researcher
How to select Alpha level
Select small level to minimize risk of error (largest accepted is .05)
Smaller risk also means more evidence is necessary to reject H0
Is reported in publications as "significant effect," "z = 2.40" and "p<.05" as examples
Distance between sample mean and population mean is most influential factor
Also variability and number in sample
High variability/standard deviation lowers chances of seeing patterns or finding treatment effect
small sample size decreases z score
Assumptions
Random sampling
Independent observations
σ is unchanged by treatment
Normal sample distribution
Directional hypotheses (one-tailed)
H0 and H1 specify whether a treatment will increase or decrease the population mean instead of simply affecting it
Ex. H0- tips will not increase (no effect), µ ≤ 16;
H1- tips will increase (positive effect), µ > 16
Critical region on one side instead of both sides
Step 1-
Find all possible sample means if H0 is TRUE.
Start with finding standard error with expected mean
Step 2-
Decide Alpha level
.001, .01, .05; then determine z score
Critical region is entirely on one side instead of split
Then it is the same as two-tailed
This means we can reject H0 when differences between means are smaller than with two-tailed
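A one-tailed sketch of the tips example above; µ = 16 is from the notes, but σ = 4, n = 25, and M = 17.5 are assumed numbers:

```python
from math import sqrt

mu0, sigma, n = 16, 4, 25    # H0: µ ≤ 16 (no increase); H1: µ > 16
M = 17.5

sigma_m = sigma / sqrt(n)    # 0.8
z = (M - mu0) / sigma_m      # 1.875

# α = .05 entirely in one tail → critical z = 1.65 (vs ±1.96 two-tailed)
reject_one_tailed = z > 1.65       # True: a smaller difference suffices
reject_two_tailed = abs(z) > 1.96  # False: the two-tailed test would not reject
```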
Concerns with hypothesis testing
Significant change does not always equal substantial change
It only says the results are "unlikely" if the treatment has no effect
Measuring Effect size
Cohen's d
Measures the difference between the two means
d = .2 - small effect;
d = .5 - medium effect;
d = .8 - large effect;
(mean difference of .2, .5, or .8 of a standard deviation)
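Cohen's d as a one-line helper; the means and σ below are assumed example values:

```python
def cohens_d(mu_treatment, mu_no_treatment, sigma):
    # d = mean difference / standard deviation; sample size n is not used
    return (mu_treatment - mu_no_treatment) / sigma

d = cohens_d(85, 80, 12)   # ≈ 0.42 → between small (.2) and medium (.5)
```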
mean difference / standard deviation =
(µtreatment - µno treatment) / σ
or the mean before treatment vs the mean afterward =
(Mtreatment - µno treatment) / σ
sample size is not considered in Cohen's d
Statistical power
The probability that the test will correctly reject a false H0 (i.e., that the test will detect a real treatment effect)
Two outcomes possible-
- fail to reject H0 (Type II error); or
- correctly reject H0 (the power of the test)
Probability of Type ll error is still p = β; so probability of the power (outcome 2) is p = 1 - β
How likely the test will be successful
Equal to the proportion (expressed as a percentage) of the treated distribution that falls in the critical region
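Power can be sketched as the proportion of the treated distribution past the critical boundary, with `math.erf` standing in for the unit normal table; all the numbers (µ = 80, treated mean 88, σ = 12, n = 36) are assumed for illustration:

```python
from math import erf, sqrt

def normal_cdf(z):
    # Proportion of a normal distribution below z
    return 0.5 * (1 + erf(z / sqrt(2)))

mu0, mu_treated, sigma, n = 80, 88, 12, 36
sigma_m = sigma / sqrt(n)              # 2.0
crit_M = mu0 + 1.96 * sigma_m          # upper critical boundary for α = .05

# z of the critical boundary, measured in the TREATED distribution
z_boundary = (crit_M - mu_treated) / sigma_m
power = 1 - normal_cdf(z_boundary)     # p = 1 - β
beta = 1 - power                       # probability of a Type II error
```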