Ch.6 Probability/Proportions
Ch.7 Distribution of Sampling Means
Ch.8 Hypothesis Testing (two-tailed)
Establishes a connection between a sample and the population. Helps to make predictions
Probability of A = number of outcomes classified as A / number of possible outcomes
Ranges from 0 to 1
Requirements
Random Sampling
Independent Random Sampling (if more than one individual is selected)
Chances remain the same
Assures that selection is unbiased
Only true if both possible outcomes have an equal chance of happening
To maintain constant probabilities, must sample WITH replacement
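The definition above and sampling with replacement can be sketched in a few lines; the jar contents are an assumed example, not from the notes:

```python
import random

# Basic probability: p(A) = outcomes classified as A / total possible outcomes
# Hypothetical jar: 4 red and 6 blue marbles
jar = ["red"] * 4 + ["blue"] * 6

p_red = jar.count("red") / len(jar)   # 4 / 10 = 0.4

# Sampling WITH replacement: the jar is unchanged between draws,
# so p(red) stays 0.4 on every draw and the draws stay independent
draws = [random.choice(jar) for _ in range(3)]
```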
Normal Distribution and shape
Y = (1 / (σ√(2π))) e^(-(X - µ)² / (2σ²))
Symmetrical with single mode in the middle
34.13% fall between the mean and z = 1; 13.59% fall between z = 1 and z = 2; and 2.28% fall beyond z = 2
Unit normal table
Lists the relationship between z score locations and proportions in a normal distribution
The same table (in Appendix B) is used for all normal distributions
To find proportions using X, first transform to z score
Used in inferential statistics to identify extreme values that have a low probability of occurring
an entire sample mean can be described with one z score
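The X-to-z transformation and the table lookup can be sketched with `math.erf` standing in for Appendix B; the parameters µ = 100, σ = 15 are an assumed example:

```python
from math import erf, sqrt

def normal_cdf(z):
    # Proportion of a normal distribution below z
    # (the value the unit normal table lists)
    return 0.5 * (1 + erf(z / sqrt(2)))

# Transform a raw score X to a z score, then find the proportion
mu, sigma = 100, 15          # assumed example parameters
X = 130
z = (X - mu) / sigma         # z = +2.0
p_above = 1 - normal_cdf(z)  # ≈ .0228, the 2.28% beyond z = 2
```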
The larger the sample, the lower the standard error
Distribution of Sample Means
collection of means from ALL the possible samples in a population
Also called Sampling Distribution since it is a distribution of means (statistics) and not raw scores
Central Limit Theorem
Specifies the distribution of sample means even for large populations where collecting all possible samples is impractical
for a population with mean µ and standard deviation σ, the distribution of sample means for samples of size n will have a mean of µ and a standard deviation of σ / √n
The expected value of M = the mean of the distribution of sample means; it is exactly equal to the population mean µ
The standard error of M = the standard deviation of the distributions of sample means
If standard error (σM) is small = sample means are close together;
when standard error is large = means are scattered
measures how much distance is expected between M and µ
shows how well a sample represents the population
Standard Error of M: σM = σ / √n
Law of Large Numbers: the larger the sample size n, the closer M is expected to be to µ
Smallest sample, n = 1, has a standard error σM = σ
If given variance (σ²): σ = √σ²
so standard error σM = √(σ² / n)
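The standard error formulas above, sketched as small helpers (the σ = 100 values are illustrative):

```python
from math import sqrt

def standard_error(sigma, n):
    # σM = σ / √n : expected distance between a sample mean M and µ
    return sigma / sqrt(n)

def standard_error_from_variance(variance, n):
    # Starting from σ² instead: σM = √(σ² / n)
    return sqrt(variance / n)

se_1 = standard_error(100, 1)     # 100.0 — smallest sample (n = 1) gives σM = σ
se_16 = standard_error(100, 16)   # 25.0 — standard error shrinks as n grows
```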
Used to find the probability of selecting a sample with a specific mean
Example
what is the probability a sample of n = 16 will have a mean higher than 525 when µ = 500 and σ = 100?
1st- distribution is normal bc population is normal
2nd- distribution has an expected value of M = 500 bc µ = 500
3rd-
for n = 16, standard error σM = 25 bc σM = σ / √n
4th-
locate the position with a z score; here z = (525 - 500) / 25 = +1 (σM = 25 is used in place of σ)
5th-
since it is a normal distribution, use the Unit Normal Table in Appendix B to find the probability beyond z = +1 (= .1587)
conclusion = only 15.87% of samples of n = 16 would have a mean above 525
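The five steps of this example run as a short script, with `math.erf` standing in for the Appendix B table lookup:

```python
from math import erf, sqrt

# Worked example: µ = 500, σ = 100, n = 16; find P(M > 525)
mu, sigma, n = 500, 100, 16

sigma_m = sigma / sqrt(n)             # step 3: standard error = 25.0
z = (525 - mu) / sigma_m              # step 4: z = +1.0 (σM plays the role of σ)
p = 1 - 0.5 * (1 + erf(z / sqrt(2)))  # step 5: proportion above z = +1 ≈ .1587
```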
Sampling Error
Difference between a statistic and the corresponding parameter, because the sample is not a perfect representation of the population
(not to be confused with Standard Error which describes the average distance from the sample mean to the population mean)
Method of defining the sampling error
natural discrepancies like this are present in all inferential statistics and are necessary to account for
If the change occurring in research is within the sampling error, it is not evidence of an effective intervention
This is essentially inferential statistics: evaluates a hypothesis about a population based on sample statistics
Step 1-
State the hypothesis
(and select Alpha level)
Step 2-
Set the criteria
(and locate critical region)
Step 3-
Collect data and compute statistics
(and obtain z score)
Step 4-
Make a decision
(reject or fail to reject H0)
Used to evaluate the results of a research study
Basic research situation
The µ of the population is known before treatment. The purpose is to see if the treatment has an effect on the population µ
Assume the treatment adds or subtracts a constant to every score, so the shape and standard deviation DO NOT CHANGE
The null hypothesis (H0) states that the treatment will produce no change or relationship. The original mean stays the same
The alternative hypothesis (H1) states that there is a change, difference, or relationship from the treatment. The original mean changes
consider which sample mean is consistent with H0
divide the distribution of sample means into two sections: likely for H0 and unlikely for H0
exact values for high and low probability
Alpha level = low probability (.001, .01, and .05 are common)
extremely unlikely values in the tails of the distribution (Alpha) = critical region
If research produces a sample mean in the critical region, H0 is rejected and H1 is accepted
use the Alpha level probability and the unit normal table (ex. α = .05 → .025 in each tail → critical z = ±1.96)
Raw data is used to create the sample mean that will be compared to the original mean
comparison uses z score
z = (M - µ) / σM
(µ is from H0)
(σM = σ / √n)
Determine if the sample mean is in the critical region
if so, H0 is false (rejected) and change has occurred. There is good evidence to support H1
If the sample mean is not in the critical region (close to the original mean), then "Fail to reject H0," which means there is no evidence that the treatment has an effect
Z score = Test Statistic = single specific statistic used to test hypothesis
large value of test statistic = large discrepancy in data (likely in critical region)
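The four steps above can be sketched as a two-tailed z test; all the numbers (µ = 80, σ = 12, n = 36, M = 85) are assumed for illustration:

```python
from math import sqrt

mu0, sigma, n = 80, 12, 36   # step 1: H0 says µ = 80 after treatment; α = .05
z_crit = 1.96                # step 2: critical region is |z| > 1.96 for α = .05

M = 85                       # step 3: sample mean from the collected data
sigma_m = sigma / sqrt(n)    # σM = σ / √n = 2.0
z = (M - mu0) / sigma_m      # test statistic z = (M - µ) / σM = 2.5

reject_h0 = abs(z) > z_crit  # step 4: 2.5 is in the critical region → reject H0
```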
Errors
Type I
Type II
When data leads to rejecting H0, but the treatment actually has no effect (H0 is true even though the sample statistic shows otherwise)
Because samples are not identical to populations
Occurs when a researcher unknowingly obtains an extreme nonrepresentative sample
Alpha level is also the probability that a Type I error will occur
A treatment effect actually exists, but the sample does not detect it. Researcher fails to reject H0
When sample mean is NOT in the critical region even though the treatment truly has an effect
No exact probability this error will occur
Represented by the beta symbol β
Not as serious as a Type I error
Error is always a possibility when working with samples
under control of researcher
How to select Alpha level
Select small level to minimize risk of error (largest accepted is .05)
Smaller risk also means more evidence is necessary to reject H0
Is reported in publications as "significant effect," "z = 2.40" and "p<.05" as examples
Distance between sample mean and population mean is most influential factor
Also variability and number in sample
High variability/standard deviation lowers chances of seeing patterns or finding treatment effect
small sample size decreases z score
Assumptions
Random sampling
Independent observations
σ is unchanged by treatment
Normal sample distribution
Directional hypotheses (one-tailed)
H0 and H1 specify whether a treatment will increase or decrease the population mean instead of simply affecting it
Ex. H0- tips will not increase (no effect), µ ≤ 16;
H1- tips will increase (positive effect), µ > 16
Critical region on one side instead of both sides
Step 1-
Find all possible sample means if H0 is TRUE.
Start with finding standard error with expected mean
Step 2-
Decide Alpha level
.001, .01, .05; then determine z score
Critical region is entirely on one side instead of split
Then it is the same as two-tailed
This means we can reject H0 when differences between means are smaller than with two-tailed
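A one-tailed sketch of the tips example above; µ = 16 is from the notes, but σ = 4, n = 25, and M = 17.5 are assumed numbers:

```python
from math import sqrt

mu0, sigma, n = 16, 4, 25    # H0: µ ≤ 16 (no increase); H1: µ > 16
M = 17.5

sigma_m = sigma / sqrt(n)    # 0.8
z = (M - mu0) / sigma_m      # 1.875

# α = .05 entirely in one tail → critical z = 1.65 (vs ±1.96 two-tailed)
reject_one_tailed = z > 1.65       # True: a smaller difference suffices
reject_two_tailed = abs(z) > 1.96  # False: the two-tailed test would not reject
```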
Concerns with hypothesis testing
Significant change does not always equal substantial change
It only says the results are "unlikely" if the treatment has no effect
Measuring Effect size
Cohen's d
Measures the difference between the two means
d = .2 - small effect;
d = .5 - medium effect;
d = .8 - large effect;
(mean difference of .2, .5, or .8 of a standard deviation)
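Cohen's d as a one-line helper; the means and σ below are assumed example values:

```python
def cohens_d(mu_treatment, mu_no_treatment, sigma):
    # d = mean difference / standard deviation; sample size n is not used
    return (mu_treatment - mu_no_treatment) / sigma

d = cohens_d(85, 80, 12)   # ≈ 0.42 → between small (.2) and medium (.5)
```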
mean difference / standard deviation =
(µtreatment - µno treatment) / σ
or the mean before treatment vs the mean afterward =
(Mtreatment - µno treatment) / σ
sample size is not considered in Cohen's d
Statistical power
The probability that the test will correctly reject a false H0 (i.e., that the test will detect a real treatment effect)
Two outcomes possible-
- fail to reject H0 (Type II error); or
- correctly reject H0 (the power of the test)
Probability of Type ll error is still p = β; so probability of the power (outcome 2) is p = 1 - β
How likely the test will be successful
Equal to the proportion (expressed as a percentage) of the treated distribution that falls in the critical region
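Power can be sketched as the proportion of the treated distribution past the critical boundary, with `math.erf` standing in for the unit normal table; all the numbers (µ = 80, treated mean 88, σ = 12, n = 36) are assumed for illustration:

```python
from math import erf, sqrt

def normal_cdf(z):
    # Proportion of a normal distribution below z
    return 0.5 * (1 + erf(z / sqrt(2)))

mu0, mu_treated, sigma, n = 80, 88, 12, 36
sigma_m = sigma / sqrt(n)              # 2.0
crit_M = mu0 + 1.96 * sigma_m          # upper critical boundary for α = .05

# z of the critical boundary, measured in the TREATED distribution
z_boundary = (crit_M - mu_treated) / sigma_m
power = 1 - normal_cdf(z_boundary)     # p = 1 - β
beta = 1 - power                       # probability of a Type II error
```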