Introduction to t statistic, Correlation and regression
Introduction to t statistic
Used when there isn't enough information to compute a z-score, usually because the population variance is unknown.
Difference between z-score and t: the z-score uses the actual population variance, while the t statistic substitutes the sample variance when the population variance is unknown.
A t distribution is the complete set of t values computed for every possible random sample for a specific sample size (n) or a specific degrees of freedom (df). The t distribution approximates the shape of a normal distribution.
To use a t test, the values in the sample must consist of independent observations, and the population sampled must be normal.
The t distribution table is like the unit normal table and is used to determine proportions for t distributions.
Larger estimated standard error = smaller t score (in absolute value), because the estimated standard error is the denominator of t
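A minimal sketch of the one-sample t computation described above, assuming the usual formula t = (M - μ) / s_M with s_M = sqrt(s^2 / n). The function name `one_sample_t` and the scores are made up for illustration.

```python
import math
import statistics

def one_sample_t(scores, mu):
    """Return (t, df) for a one-sample t test against a hypothesized mean mu."""
    n = len(scores)
    sample_mean = statistics.mean(scores)
    sample_variance = statistics.variance(scores)  # divides by n - 1
    estimated_standard_error = math.sqrt(sample_variance / n)
    t = (sample_mean - mu) / estimated_standard_error
    return t, n - 1

scores = [12, 14, 15, 13, 16, 14, 15, 13, 14]  # hypothetical sample, n = 9
t, df = one_sample_t(scores, mu=12)
print(f"t({df}) = {t:.3f}")
```

Because the estimated standard error sits in the denominator, doubling it would halve t for the same mean difference.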
Degrees of freedom: df=n-1
Describes the number of scores in a sample that are independent and free to vary.
Law of large numbers
The larger the sample size, the closer the sample variance is to the population variance, making the t statistic a more accurate stand-in for z
Larger sample size = smaller estimated standard error, so a larger t score for the same mean difference
More likely to find significant results
The larger the df the closer a t distribution will be to a normal distribution
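One rough way to see this convergence is to compare the height of the t density at its center with the height of the standard normal density. The sketch below uses the standard t density formula. The function names are illustrative, and comparing only the peak height is a simplification (the tails behave similarly, getting thinner as df grows).

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom, evaluated at t."""
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(t * t / df))

def normal_pdf(z):
    """Density of the standard normal distribution, evaluated at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

dfs = [1, 2, 5, 30, 1000]
peaks = [t_pdf(0.0, df) for df in dfs]
for df, peak in zip(dfs, peaks):
    # The gap to the normal peak shrinks as df grows.
    print(df, round(normal_pdf(0.0) - peak, 5))
```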
Hypothesis testing with t stat
Use the population mean stated in the null hypothesis as μ in the t formula, since the null hypothesis says the mean didn't change
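Estimated Cohen's d
Used to measure effect size for the t statistic: estimated d = (M - μ) / s, the mean difference divided by the sample standard deviation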
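A short sketch of estimated Cohen's d, assuming the common definition d = (M - μ) / s, where s is the sample standard deviation. The data and the function name are hypothetical.

```python
import statistics

def estimated_cohens_d(scores, mu):
    """Mean difference divided by the sample standard deviation."""
    sample_mean = statistics.mean(scores)
    sample_sd = statistics.stdev(scores)  # square root of the n - 1 variance
    return (sample_mean - mu) / sample_sd

scores = [12, 14, 15, 13, 16, 14, 15, 13, 14]  # hypothetical sample
d = estimated_cohens_d(scores, mu=12)
print(f"estimated d = {d:.3f}")
```

Unlike t, this value does not grow with sample size, because it divides by s rather than by the standard error.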
Alternative method for measuring effect size: r^2, the percentage of variance accounted for by the treatment, r^2 = t^2 / (t^2 + df)
r^2=.01 small effect size
r^2=.09 medium effect size
r^2=.25 large effect size
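A sketch of the percentage-of-variance calculation, assuming the standard formula r^2 = t^2 / (t^2 + df). Treating the three benchmark values above as lower cutoffs for each label is an assumption made here for illustration, and the t and df values are made up.

```python
def r_squared_from_t(t, df):
    """Proportion of variance accounted for, computed from a t statistic."""
    return t * t / (t * t + df)

def effect_size_label(r2):
    """Label r^2 using the benchmarks in these notes (assumed to be lower cutoffs)."""
    if r2 >= 0.25:
        return "large"
    if r2 >= 0.09:
        return "medium"
    if r2 >= 0.01:
        return "small"
    return "below small"

r2 = r_squared_from_t(t=3.0, df=16)  # hypothetical test result
print(round(r2, 2), effect_size_label(r2))
```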
A confidence interval is an interval, or range of values, centered around a sample statistic.
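For a single sample mean, such an interval is usually built as M ± t * s_M. The sketch below assumes that form. The critical value 2.306 is the tabled two-tailed t for df = 8 at the 95% level, and the data and function name are hypothetical.

```python
import math
import statistics

def confidence_interval(scores, t_critical):
    """Interval centered on the sample mean: M +/- t_critical * standard error."""
    n = len(scores)
    sample_mean = statistics.mean(scores)
    standard_error = math.sqrt(statistics.variance(scores) / n)
    margin = t_critical * standard_error
    return sample_mean - margin, sample_mean + margin

scores = [12, 14, 15, 13, 16, 14, 15, 13, 14]  # hypothetical sample, df = 8
low, high = confidence_interval(scores, t_critical=2.306)  # value read from a t table
print(f"95% CI: {low:.2f} to {high:.2f}")
```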
Correlation and regression
Correlation is a statistical technique that is used to measure and describe the relationship between two variables.
Relationship characteristics
sign of the correlation (+ or -) describes the direction
Positive: as X increases or decreases, Y also increases or decreases
same direction
Negative: as X increases, Y decreases, or vice versa
opposite directions
Form: the relationship most commonly measured is linear, with points clustering around a straight line
Consistency: a correlation of 1.00 or -1.00 is a perfect relationship, meaning each change in X is accompanied by a perfectly predictable change in Y. A correlation of 0 means there is no consistency in the relationship
The Pearson correlation measures the degree and the direction of the linear relationship between two variables.
r = covariability of X and Y / variability of X and Y separately
Covariability is measured by the sum of products of deviations, or SP
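A minimal sketch of the Pearson computation, assuming the usual computational form r = SP / sqrt(SS_X * SS_Y), where SS is the sum of squared deviations for each variable. The data are a small made-up example and the function name is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from the sum of products (SP) and sums of squares (SS)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # covariability
    ss_x = sum((a - mean_x) ** 2 for a in x)  # variability of X alone
    ss_y = sum((b - mean_y) ** 2 for b in y)  # variability of Y alone
    return sp / math.sqrt(ss_x * ss_y)

x = [0, 10, 4, 8, 8]  # hypothetical scores
y = [2, 6, 2, 4, 6]
r = pearson_r(x, y)
print(r)  # SP = 28, SS_X = 64, SS_Y = 16, so r = 28 / 32 = 0.875
```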
Phi coefficient: Convert each of the dichotomous variables to numerical values by assigning a 0 to one category and a 1 to the other category for each of the variables.
Use the regular Pearson formula with the converted scores.
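The two steps above can be sketched as follows. The category labels, the choice of which category gets 1, and the helper names are all made up for illustration. Swapping the 0/1 assignment for one variable only flips the sign of the result.

```python
import math

def pearson_r(x, y):
    """Regular Pearson formula: SP / sqrt(SS_X * SS_Y)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def phi_coefficient(x_labels, y_labels, x_one, y_one):
    """Code each dichotomous variable as 0/1, then apply the Pearson formula."""
    x_coded = [1 if label == x_one else 0 for label in x_labels]
    y_coded = [1 if label == y_one else 0 for label in y_labels]
    return pearson_r(x_coded, y_coded)

# Hypothetical data: did the student attend a review session, and the exam outcome.
attended = ["yes", "no", "yes", "yes", "no", "no", "yes", "no"]
outcome = ["pass", "fail", "pass", "fail", "fail", "fail", "pass", "pass"]
phi = phi_coefficient(attended, outcome, x_one="yes", y_one="pass")
print(phi)
```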
Square r to find actual strength of relationship
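Coefficient of determination (r^2): the proportion of variability in one variable that can be determined from its relationship with the other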
Can be found with z scores
sample: r = Σ(z_X z_Y) / (n - 1)
population: ρ = Σ(z_X z_Y) / N
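A sketch of the z-score route, assuming the standard forms: for a sample, r = Σ(z_X z_Y) / (n - 1) with z-scores based on the sample standard deviation, and for a population, ρ = Σ(z_X z_Y) / N with z-scores based on the population standard deviation. Data and function names are hypothetical.

```python
import statistics

def z_scores(values, sd):
    """Convert raw scores to z-scores using the supplied standard deviation."""
    mean = statistics.mean(values)
    return [(v - mean) / sd for v in values]

def r_from_sample_z(x, y):
    """Sample correlation: sum of z-score products divided by n - 1."""
    zx = z_scores(x, statistics.stdev(x))
    zy = z_scores(y, statistics.stdev(y))
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

def rho_from_population_z(x, y):
    """Population correlation: sum of z-score products divided by N."""
    zx = z_scores(x, statistics.pstdev(x))
    zy = z_scores(y, statistics.pstdev(y))
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

x = [0, 10, 4, 8, 8]  # hypothetical scores
y = [2, 6, 2, 4, 6]
r_sample = r_from_sample_z(x, y)
rho_population = rho_from_population_z(x, y)
print(round(r_sample, 3), round(rho_population, 3))
```

Both routes give the same value for the same scores; the divisor only has to match the kind of standard deviation used for the z-scores.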
Correlation does not equal causation
Outliers can distort the correlation, so check the scatter plot
Hypothesis testing for the Pearson correlation tests whether a correlation exists in the population. Null = no correlation (ρ = 0), alternative = a real, nonzero correlation (ρ ≠ 0)
Directional: null = the correlation is not positive (ρ ≤ 0), alternative = the correlation is positive (ρ > 0)
t statistic for Pearson: t = r / sqrt((1 - r^2) / (n - 2)), with df = n - 2
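A short sketch of this test statistic, assuming the standard form t = r / sqrt((1 - r^2) / (n - 2)) with df = n - 2 for the null hypothesis ρ = 0. The r and n values are made up, and the function name is illustrative.

```python
import math

def t_for_pearson(r, n):
    """Return (t, df) for testing a sample correlation r against rho = 0."""
    df = n - 2
    standard_error = math.sqrt((1 - r * r) / df)
    return r / standard_error, df

t, df = t_for_pearson(r=0.875, n=5)  # hypothetical sample correlation
print(f"t({df}) = {t:.3f}")
```

The resulting t is compared with the critical value from the t table at df = n - 2.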
Alternatives to Pearson correlation
First, the Spearman correlation is used to measure the relationship between X and Y when both variables are measured on ordinal scales. In addition to measuring relationships for ordinal data, the Spearman correlation can be used as a valuable alternative to the Pearson correlation, even when the original raw scores are on an interval or a ratio scale.
Used when the relationship is consistently one-directional (monotonic) but not linear
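One common way to compute the Spearman correlation is to rank each variable and then apply the Pearson formula to the ranks. The sketch below assumes that approach, with tied scores given the average of their ranks. The data and helper names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def spearman_r(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x ** 3: always increasing, but curved
print(spearman_r(x, y), round(pearson_r(x, y), 3))
```

For this curved but consistently increasing relationship the ranks line up perfectly, while the Pearson value on the raw scores falls short of 1.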
Linear equations: Y = bX + a, where b is the slope and a is the Y-intercept
The statistical technique for finding the best-fitting straight line for a set of data is called regression, and the resulting straight line is called the regression line.
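A minimal sketch of finding the regression line, assuming the least-squares solution for Ŷ = bX + a, with slope b = SP / SS_X and intercept a = M_Y - b * M_X. The data and the function name are hypothetical.

```python
def regression_line(x, y):
    """Return (slope b, intercept a) of the least-squares line Y-hat = b * X + a."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    ss_x = sum((p - mean_x) ** 2 for p in x)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

x = [0, 10, 4, 8, 8]  # hypothetical scores
y = [2, 6, 2, 4, 6]
b, a = regression_line(x, y)
print(f"Y-hat = {b}X + {a}")
```

The line always passes through the point (M_X, M_Y), which is what the intercept formula enforces.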
The standard error of estimate gives a measure of the standard distance between the predicted Y values on the regression line and the actual Y values in the data.
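A sketch of the standard error of estimate, assuming the usual definition sqrt(SS_residual / (n - 2)), where SS_residual is the sum of squared distances between the actual Y values and the predicted Y values. The data and function name are hypothetical.

```python
import math

def standard_error_of_estimate(x, y):
    """Fit the least-squares line, then measure the typical residual size."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    ss_x = sum((p - mean_x) ** 2 for p in x)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    # Squared distances between actual Y and the predicted Y on the line.
    ss_residual = sum((q - (slope * p + intercept)) ** 2 for p, q in zip(x, y))
    return math.sqrt(ss_residual / (n - 2))

x = [0, 10, 4, 8, 8]  # hypothetical scores
y = [2, 6, 2, 4, 6]
se = standard_error_of_estimate(x, y)
print(round(se, 3))
```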