Statistics for the Behavioral Sciences
Introduction to the t statistic
Sample variance: s^2 = SS / (n - 1) = SS / df
Sample standard deviation: s = √(SS / (n - 1)) = √(SS / df)
Estimated standard error - used as an estimate of the real standard error σM when the value of σ is unknown. It is computed from the sample variance or sample standard deviation and provides an estimate of the standard distance between a sample mean M and the population mean μ.
Estimated standard error: sM = s / √n, or equivalently sM = √(s^2 / n)
The estimated standard error of M typically is presented and computed using variance
2 reasons for making this shift from standard deviation to variance:
The sample variance is an unbiased statistic and provides an accurate and unbiased estimate of the population variance.
Variance is used in the standard-error formula for all of the different t statistics. Thus, estimated standard error = √(sample variance / sample size), as in the sketch below.
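A minimal sketch of these formulas in Python, using a small made-up sample:

    import math

    scores = [4, 6, 7, 9, 9]                 # hypothetical sample data
    n = len(scores)
    M = sum(scores) / n                      # sample mean

    SS = sum((x - M) ** 2 for x in scores)   # sum of squared deviations
    s2 = SS / (n - 1)                        # sample variance, df = n - 1
    s = math.sqrt(s2)                        # sample standard deviation
    sM = math.sqrt(s2 / n)                   # estimated standard error of M
    print(M, s2, s, sM)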
t statistic - is used to test hypotheses about an unknown population mean μ when the value of σ is unknown: t = (M - μ) / sM
Degrees of freedom - describes the # of scores in a sample that are independent and free to vary: df = n-1
t distribution is the complete set of t values computed for every possible random sample for a specific sample size (n) or a specific degrees of freedom (df). The t distribution approximates the shape of a normal distribution
The greater the sample size (n) is, the larger the degrees of freedom are, and the better the t distribution approximates the normal distribution.
The shape of a t distribution: as df gets larger, the t distribution gets closer in shape to the normal z-score distribution. Because the t distribution has more variability than the normal z distribution, it tends to be flatter and more spread out, whereas the z distribution has more of a central peak.
t distribution table - the numbers in the table are the values of t that separate the tail from the main body of the distribution. Proportions for one or two tails are listed across the top of the table, and the df values are listed in the first column.
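The same critical values can be read from scipy's t distribution instead of a printed table (the df and alpha below are illustrative choices, not values from the notes):

    from scipy import stats

    df, alpha = 15, 0.05
    t_one_tail = stats.t.ppf(1 - alpha, df)       # one-tailed critical t
    t_two_tail = stats.t.ppf(1 - alpha / 2, df)   # two-tailed critical t
    print(t_one_tail, t_two_tail)                 # ≈ 1.753 and ≈ 2.131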
Hypothesis Test with t Statistic
t = [sample mean (from the data) - population mean (hypothesized by the null hypothesis)] / estimated standard error (computed from the sample data)
Step 1: State the Hypotheses and Select an Alpha Level
Step 2: Locate the Critical Region
Step 3: Calculate the t statistic: t = (M - μ) / sM
Step 4: Make a Decision Regarding the Null Hypothesis (a worked sketch follows the assumptions below)
Two assumptions of the t test:
1 - The values in the sample must consist of independent observations
2 - The population from which the sample is selected must be normal
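A sketch of the four steps with a hypothetical sample, assuming H0: μ = 10 and α = .05, two-tailed (all numbers are made up):

    import math
    from scipy import stats

    scores = [12, 9, 14, 11, 13, 10, 12, 11]   # hypothetical sample
    mu = 10                                     # value stated by H0
    alpha = 0.05                                # Step 1: alpha level

    # Step 2: locate the critical region
    n = len(scores)
    df = n - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df)     # two-tailed critical t

    # Step 3: calculate the t statistic
    M = sum(scores) / n
    SS = sum((x - M) ** 2 for x in scores)
    sM = math.sqrt((SS / (n - 1)) / n)          # estimated standard error
    t = (M - mu) / sM

    # Step 4: make a decision about H0
    print(t, t_crit, abs(t) > t_crit)           # reject H0 if |t| > t_crit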
Estimated Cohen's d = mean difference / sample standard deviation = (M - μ) / s
Percentage of variance accounted for by the treatment - a measure of effect size that determines what portion of the variability in the scores can be accounted for by the treatment effect; for the t test it is computed as r^2 = t^2 / (t^2 + df)
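Both effect-size measures, sketched with the same hypothetical numbers as the test above:

    import math

    M, mu, s2, n = 11.5, 10, 18 / 7, 8    # values from the sketch above
    s = math.sqrt(s2)
    d = (M - mu) / s                      # estimated Cohen's d

    t = (M - mu) / math.sqrt(s2 / n)      # t statistic
    df = n - 1
    r2 = t ** 2 / (t ** 2 + df)           # proportion of variance explained
    print(d, r2)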
Confidence interval - an interval, or range of values, centered around a sample statistic. The logic behind a confidence interval is that a sample statistic, such as a sample mean, should be relatively near the corresponding population parameter, so an interval built around the statistic should contain the parameter.
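A sketch of a 95% confidence interval for μ, built as M ± t·sM with the t value that bounds the middle 95% of the t distribution (same hypothetical sample as above):

    import math
    from scipy import stats

    M, s2, n = 11.5, 18 / 7, 8           # values from the sketch above
    sM = math.sqrt(s2 / n)
    df = n - 1

    t = stats.t.ppf(0.975, df)           # t bounding the middle 95%
    lower, upper = M - t * sM, M + t * sM
    print(lower, upper)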
Correlation - a statistical technique that is used to measure and describe the relationship between two variables
A correlation requires two scores for each individual (one score from each of the two variables). These scores normally are identified as X and Y
The direction of the relationship
In a positive correlation the two variables tend to change in the same direction: as the value of the X variable increases from one individual to another, the Y variable also tends to increase; when the X variable decreases, the Y variable also decreases
In a negative correlation, the two variables tend to go in opposite directions. As the X variable increases, the Y variable decreases. That is, it is an inverse relationship
The form of the relationship
The most common use of correlation is to measure straight-line relationships
Strength or Consistency of the Relationship
The closer the correlation is to +1.00 or -1.00, the stronger it is. A perfect correlation is always identified by a correlation of 1.00 and indicates a perfectly consistent relationship
The Pearson correlation measures the degree and the direction of the linear relationship between two variables
r = covariability of X and Y / variability of X and Y separately
The sum of products of deviations (SP) - a measure of the degree of covariability between two variables; the degree to which they vary together
Definitional formula: SP = Σ(X - Mx)(Y - My)
Computational formula: SP = ΣXY - (ΣX)(ΣY) / n
SS Formula for X variable : SS = ΣX^2 - (ΣX)^2 / n
SS formula for Y variable: SS = ΣY^2 - (ΣY)^2 / n
Pearson r correlation formula: r = SP / √(SSx · SSy)
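A sketch of r computed exactly as these formulas define it; X and Y are hypothetical paired scores:

    import math

    X = [1, 3, 4, 6, 7]
    Y = [2, 5, 3, 7, 8]
    n = len(X)

    Mx, My = sum(X) / n, sum(Y) / n
    SP = sum((x - Mx) * (y - My) for x, y in zip(X, Y))   # definitional SP
    SSx = sum((x - Mx) ** 2 for x in X)
    SSy = sum((y - My) ** 2 for y in Y)

    r = SP / math.sqrt(SSx * SSy)        # Pearson correlation
    print(r)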
The value of a correlation can be affected greatly by the range of scores represented in the data
One or two extreme data points (outliers) can have a dramatic effect on the value of a correlation
The value of r^2 is called the coefficient of determination because it measures the proportion of variability in one variable that can be determined from the relationship with the other variable
Partial Correlation measures the relationship between two variables while controlling the influence of a third variable by holding it constant
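A sketch of a first-order partial correlation. The notes give no formula, so this uses the standard one, r_xy.z = (r_xy - r_xz·r_yz) / √((1 - r_xz^2)(1 - r_yz^2)), with made-up pairwise correlations:

    import math

    r_xy, r_xz, r_yz = 0.60, 0.50, 0.40   # hypothetical pairwise correlations

    # correlation of X and Y with Z held constant
    r_xy_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    print(r_xy_z)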
Hypothesis Test with Pearson Correlation
The null hypothesis states either that there is no population correlation (ρ = 0) or, for a directional (one-tailed) test, that the population correlation is not positive (or not negative)
The alternative hypothesis states either that there is a real, nonzero population correlation or, for a directional (one-tailed) test, that the population correlation is positive (or negative)
Standard error for r: sr = √((1 - r^2) / (n - 2))
t statistic: t = (r - ρ) / √((1 - r^2) / (n - 2))
Degrees of Freedom: df = n - 2
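A sketch of this test, assuming H0: ρ = 0 so the numerator reduces to r (the sample correlation and n are made up):

    import math
    from scipy import stats

    r, n = 0.65, 20                        # hypothetical sample correlation
    df = n - 2

    t = r / math.sqrt((1 - r ** 2) / df)   # t statistic for the correlation
    t_crit = stats.t.ppf(0.975, df)        # two-tailed, alpha = .05
    print(t, t_crit, abs(t) > t_crit)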
Spearman Correlation: a correlation calculated for ordinal data. Also used to measure the consistency of direction for a relationship
Point-biserial correlation - a correlation between two variables where one of the variables is dichotomous
The phi-coefficient - a correlation between two variables, both of which are dichotomous
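A sketch of the three special correlations via scipy; all data are made up, and the point-biserial and phi coefficients are simply Pearson r applied to dichotomous (0/1) scores:

    from scipy import stats

    # Spearman: ordinal (ranked) data
    rho, _ = stats.spearmanr([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])

    # Point-biserial: one dichotomous variable, one numeric variable
    rpb, _ = stats.pointbiserialr([0, 0, 0, 1, 1, 1], [4, 5, 6, 8, 9, 10])

    # Phi: both variables dichotomous (Pearson r on 0/1 data)
    phi, _ = stats.pearsonr([0, 0, 1, 1, 0, 1], [0, 1, 1, 1, 0, 1])

    print(rho, rpb, phi)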