t-statistic & correlation and regression, Chapter 14, Chapter 9 -…
t-statistic & correlation and regression
t-statistic
reveals the distance, in estimated standard errors, between the sample mean and the hypothesized population mean
an alternative to the z-score
used to test hypotheses about an unknown population mean when the value of σ is unknown
t = (M - μ) / s_M
t = (sample mean (from the data) - population mean (hypothesized by H0)) / estimated standard error (computed from the sample data)
t formula uses the corresponding sample variance (or standard deviation) when the population value is not known
assumptions of the t-test
The values in the sample must consist of independent observations.
The population sampled must be normal.
estimated standard error
s_M = √(s² / n): the square root of the sample variance divided by the sample size
substitute the estimated standard error in the denominator of the z-score formula
symbol: s_M
sample variance and sample size determine the size of the standard error
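The t formula and the estimated standard error above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `t_statistic` and the sample scores are hypothetical, chosen so the arithmetic is easy to follow.

```python
import math

def t_statistic(scores, mu):
    """One-sample t for a hypothesized population mean mu.

    Returns the t value and its degrees of freedom (n - 1).
    """
    n = len(scores)
    m = sum(scores) / n                       # sample mean M
    ss = sum((x - m) ** 2 for x in scores)    # sum of squared deviations
    variance = ss / (n - 1)                   # sample variance s^2
    s_m = math.sqrt(variance / n)             # estimated standard error s_M
    return (m - mu) / s_m, n - 1

# Hypothetical data: M = 10, s^2 = 4, s_M = 1
t, df = t_statistic([7, 11, 11, 11], mu=8)
print(t, df)
```

With these scores the sample mean sits two estimated standard errors above the hypothesized mean, so t = 2 with df = 3.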
degrees of freedom
the larger the sample size, the more likely it is that the sample mean is close to the population mean
df = n - 1
the number of scores in a sample that are independent and free to vary
t distribution
complete set of t values computed for every possible random sample for a specific sample size or a specific degrees of freedom
approximates the shape of a normal distribution
flatter and more spread out than the normal distribution, especially for small samples (small df), because a small sample carries less information about the population standard deviation and the estimated standard error varies from one random sample to the next
Cohen's d
mean difference / standard deviation
estimated d
mean difference / sample standard deviation
estimated d = (M - μ) / s
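As a rough sketch of the estimated d formula, the snippet below divides the mean difference by the sample standard deviation. The function name `estimated_d` and the scores are hypothetical, used only for illustration.

```python
import math

def estimated_d(scores, mu):
    """Estimated Cohen's d: (M - mu) / s, with s computed using df = n - 1."""
    n = len(scores)
    m = sum(scores) / n                                        # sample mean M
    s = math.sqrt(sum((x - m) ** 2 for x in scores) / (n - 1)) # sample standard deviation
    return (m - mu) / s

# Hypothetical data: M = 10, s = 2, hypothesized mu = 8
print(estimated_d([7, 11, 11, 11], mu=8))
```

Unlike t, this measure does not use the sample size in the denominator, so it describes the size of the effect rather than its statistical significance.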
percentage of variance
r^2
variability accounted for by the treatment effect / total variability
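For a t test, the percentage of variance accounted for is commonly computed from the t value and its degrees of freedom as r² = t² / (t² + df). The sketch below assumes that formula; the function name `percent_variance` and the input values are hypothetical.

```python
def percent_variance(t, df):
    """Proportion of variance accounted for: r^2 = t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

# Hypothetical result from a one-sample t test: t = 2.0 with df = 3
r_squared = percent_variance(2.0, 3)
print(round(r_squared, 4))
```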
confidence interval
an interval, or range of values, centered around a sample statistic
based on the observation that every sample mean has a corresponding t value defined by the equation t = (M - μ) / s_M
factors affecting the width of a confidence interval
To gain more confidence in your estimate, you must increase the width of the interval
the bigger the sample, the smaller the interval
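A confidence interval for the population mean can be sketched as M ± t · s_M, where t is the critical value for the chosen level of confidence. In the example below the function name `confidence_interval` and the sample values are hypothetical, and the critical values are read from a standard t table for df = 3 rather than computed.

```python
def confidence_interval(m, s_m, t_crit):
    """Return (lower, upper) for mu = M +/- t * s_M."""
    margin = t_crit * s_m
    return m - margin, m + margin

# Hypothetical sample: M = 10, s_M = 1, n = 4 (df = 3).
# Critical t values taken from a t table for df = 3.
ci_80 = confidence_interval(10, 1, 1.638)   # 80% confidence
ci_95 = confidence_interval(10, 1, 3.182)   # 95% confidence
print(ci_80)
print(ci_95)
```

The 95% interval is wider than the 80% interval, which matches the first factor above. A larger sample shrinks s_M (and the critical t value), which narrows the interval.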
correlation
statistical technique that is used to measure and describe the relationship between two variables.
H0: no correlation in the population (ρ = 0)
H1: nonzero correlation in the population (ρ ≠ 0)
Direction of the Relationship
positive correlation: the two variables tend to change in the same direction
negative correlation: the two variables tend to go in opposite directions
form of the relationship
strength or consistency of the relationship
perfect correlation - identified by a correlation of +1.00 or -1.00 and indicates a perfectly consistent relationship
envelope - encloses the data and helps you see the overall trends
Pearson correlation
measures the degree and the direction of the linear relationship between two variables
sum of products of deviations, or SP
measure the amount of covariability between two variables
definitional formula: SP = Σ(X - M_X)(Y - M_Y)
computational formula: SP = ΣXY - (ΣX)(ΣY) / n
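The two SP formulas give the same value, and SP feeds the Pearson correlation through r = SP / √(SS_X · SS_Y), where SS is the sum of squared deviations for each variable. The sketch below checks this on a small made-up data set; the function names and the scores are hypothetical.

```python
import math

def sp_definitional(xs, ys):
    """SP = sum of (X - M_X)(Y - M_Y)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def sp_computational(xs, ys):
    """SP = sum(XY) - (sum X)(sum Y) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

def pearson_r(xs, ys):
    """r = SP / sqrt(SS_X * SS_Y)."""
    ss_x = sp_definitional(xs, xs)   # SS is SP of a variable with itself
    ss_y = sp_definitional(ys, ys)
    return sp_definitional(xs, ys) / math.sqrt(ss_x * ss_y)

# Hypothetical scores for four individuals
X = [1, 2, 4, 5]
Y = [3, 6, 4, 7]
print(sp_definitional(X, Y), sp_computational(X, Y))
print(pearson_r(X, Y))
```

For these scores both formulas give SP = 6, SS_X = SS_Y = 10, and r = 0.6.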
Spearman correlation
alternative to Pearson
measures the relationship between two variables when both are measured on ordinal scales (ranks)
Spearman is used when the original data are ordinal; that is, when the X and Y values are ranks
used when a researcher wants to measure the degree to which the relationship between X and Y is consistently one directional, independent of the specific form of the relationship
Where and why correlations are used
prediction
validity
reliability
theory verification
restricted range
when a correlation is computed from scores that do not represent the full range of possible values
coefficient of determination
measures the proportion of variability in one variable that can be determined from the relationship with the other variable
point-biserial correlation
used to measure the relationship between two variables in situations in which one variable consists of regular, numerical scores, but the second variable has only two values
dichotomous variable (or binomial variable): a variable with only two values
phi-coefficient: used when both variables (X and Y) measured for each individual are dichotomous
ranking tied scores
List the scores in order from smallest to largest. Include tied values in the list.
Assign a rank (first, second, and so on) to each position in the ordered list.
When two (or more) scores are tied, compute the mean of their ranked positions, and assign this mean value as the final rank for each score.
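The three ranking steps above can be sketched as follows. The function name `rank_with_ties` and the example scores are hypothetical; the function returns ranks in the original order of the scores.

```python
def rank_with_ties(scores):
    """Rank scores from smallest to largest, giving tied scores the mean of their positions."""
    ordered = sorted(scores)                              # step 1: order the scores, ties included
    positions = {}
    for position, value in enumerate(ordered, start=1):   # step 2: assign positions 1, 2, 3, ...
        positions.setdefault(value, []).append(position)
    mean_rank = {value: sum(p) / len(p)                   # step 3: average the positions of ties
                 for value, p in positions.items()}
    return [mean_rank[x] for x in scores]

# Hypothetical scores: the two 3s occupy positions 1 and 2, so each gets rank 1.5
print(rank_with_ties([3, 5, 3, 6, 12]))
```

Computing a Pearson correlation on ranks produced this way is one way to obtain a Spearman correlation when ties are present.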
linear equations
Y = bX + a, where b is the slope and a is the Y-intercept
Regression
regression line: the straight line that best fits the data
least-squares solution: the best-fitting line is the one that minimizes the sum of squared distances between the line and the data points
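For a line Y = bX + a, the least-squares solution is usually written as b = SP / SS_X and a = M_Y - b · M_X. The sketch below assumes those formulas; the function name `least_squares_line` and the scores are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (b, a) for the best-fitting line Y = bX + a."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # sum of products of deviations
    ss_x = sum((x - mx) ** 2 for x in xs)                   # sum of squared deviations for X
    b = sp / ss_x            # slope
    a = my - b * mx          # Y-intercept
    return b, a

# Hypothetical scores: SP = 6, SS_X = 10, M_X = 3, M_Y = 5
b, a = least_squares_line([1, 2, 4, 5], [3, 6, 4, 7])
print(round(b, 2), round(a, 2))
```

For these scores the slope is 0.6 and the intercept is 3.2, so the regression line passes through the point (M_X, M_Y) = (3, 5).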
Chapter 14
Chapter 9