STATS I - Coggle Diagram
STATS I
NOMINAL ORDINAL
ORDINAL
CATEGORICAL
DEFINITION: the values have a clear order, but the spacing between them is not necessarily equal (completely agree, partially agree...)
CATEGORICAL (both): - CHI-SQUARE is used to test whether the association between two categorical variables is statistically significant
- does not work with percentages; use the raw counts (frequencies)
- (as with the t-distribution): the p-value is compared with the critical level (alpha) ...
- Chi-square is bivariate
assumptions: independent observations, expected frequencies must be > 5
- when χ² increases, the result is more likely to be significant
- if the expected frequencies are too small, we merge categories OR use Fisher's exact test (2x2) OR the likelihood ratio
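The chi-square steps above can be sketched in a few lines. This is a minimal illustration, not a replacement for SPSS output: the counts are made up, the function name `chi_square_2x2` is hypothetical, and the p-value shortcut only holds for a 2x2 table (1 degree of freedom).

```python
import math

def chi_square_2x2(table):
    """Return (chi2, expected, p_value) for a 2x2 table of raw counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, expected, p_value

observed = [[30, 10], [20, 40]]  # made-up raw counts, not percentages
chi2, expected, p_value = chi_square_2x2(observed)

# Assumption check from the notes: every expected frequency must be > 5
assumption_ok = all(e > 5 for row in expected for e in row)
significant = p_value < 0.05  # alpha = 0.05 is an assumed threshold
print(chi2, expected, p_value, assumption_ok, significant)
```

If `assumption_ok` were False, the notes suggest merging categories or switching to Fisher's exact test instead of trusting this statistic.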
ordinal: Gamma (dichotomous nominal variable), ranges from -1 to +1 (-1: every pair of observations is in disagreement; +1: every pair is in agreement)
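To make the "pairs in agreement vs disagreement" idea concrete, here is a small sketch of Gamma, assuming the usual Goodman-Kruskal definition (concordant minus discordant pairs, divided by their sum). The function name and the toy data are illustrative only.

```python
from itertools import combinations

def goodman_kruskal_gamma(x, y):
    """Gamma = (concordant - discordant) / (concordant + discordant)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # the pair is ordered the same way on both variables
        elif product < 0:
            discordant += 1   # the pair is ordered in opposite ways
        # pairs tied on either variable are ignored
    return (concordant - discordant) / (concordant + discordant)

all_agree = goodman_kruskal_gamma([1, 2, 3], [1, 2, 3])
all_disagree = goodman_kruskal_gamma([1, 2, 3], [3, 2, 1])
print(all_agree, all_disagree)
```

The function raises a ZeroDivisionError when every pair is tied, which is left unhandled to keep the sketch short.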
DISTRIBUTIONS
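NORMAL DISTRIBUTION: symmetrical, bell-shaped distribution; the standard normal distribution has a mean of 0 and an SD of 1
two-tailed critical z values:
confidence level 90% (α = 0.10) --> 1.645
confidence level 95% (α = 0.05) --> 1.96
confidence level 99% (α = 0.01) --> 2.576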
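The three critical values can be recovered from the standard normal distribution. A minimal sketch, assuming two-tailed tests; `critical_z` is a hypothetical helper name.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-tailed critical z value for a given confidence level."""
    alpha = 1 - confidence
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {critical_z(level):.3f}")
```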
SAMPLING DISTRIBUTION: distribution of the sample means
- obtained by drawing an infinite number of samples and plotting the mean of each
- it is abstract and theoretical
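Because the sampling distribution is theoretical, a simulation can only approximate it with a finite number of samples. The sketch below assumes a made-up, skewed population (exponential) and shows that the means of many samples cluster around the population mean, with a spread close to SD / sqrt(n).

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical skewed population: exponential with mean 1 and SD 1
population = [random.expovariate(1.0) for _ in range(50_000)]

def sample_means(population, sample_size, n_samples):
    """Draw n_samples random samples and return the mean of each."""
    return [
        statistics.fmean(random.sample(population, sample_size))
        for _ in range(n_samples)
    ]

means = sample_means(population, sample_size=50, n_samples=2_000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)
mean_of_means = statistics.fmean(means)
se_theory = pop_sd / 50 ** 0.5          # standard error predicted by theory
se_simulated = statistics.stdev(means)  # spread of the simulated sample means
print(pop_mean, mean_of_means, se_theory, se_simulated)
```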
WEEK 5 interval/ratio variables COVARIANCE (Pearson's correlation): a change in one variable is accompanied by a change in the other.
- > 0 = positive association between variables, both variables increase or decrease together
- = 0 no (linear) association between variables
- < 0 = negative association between variables, if one variable increases, the other decreases (and vice versa)
- Pearson's r is the covariance standardized to lie between -1 and +1
- Pearson's r follows its own sampling distribution (not normal)
- to determine a confidence interval, use bootstrapping
when the absolute value of Pearson's r is bigger than the critical value, we reject the null hypothesis (the opposite of the p-value, where we reject when it is smaller than α)
when n < 30, we must ensure that the population variables are normally distributed, using a P-P plot: the points must fall on a straight, upward-sloping line.
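As an illustration of the two ideas above (r as standardized covariance, and bootstrapping for the confidence interval), here is a rough sketch. It assumes a simple percentile bootstrap, which is only one of several bootstrap variants, and the data and function names are made up.

```python
import random
from statistics import fmean

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their SDs."""
    mx, my = fmean(x), fmean(y)
    n = len(x)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sx = (sum((a - mx) ** 2 for a in x) / (n - 1)) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / (n - 1)) ** 0.5
    return cov / (sx * sy)

def bootstrap_ci(x, y, n_boot=2000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for r (resampling pairs with replacement)."""
    rng = random.Random(seed)
    pairs = list(zip(x, y))
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(pairs) for _ in pairs]
        xs, ys = zip(*resample)
        if len(set(xs)) > 1 and len(set(ys)) > 1:  # skip resamples with no variation
            estimates.append(pearson_r(xs, ys))
    estimates.sort()
    tail = (1 - confidence) / 2
    lower = estimates[int(tail * len(estimates))]
    upper = estimates[int((1 - tail) * len(estimates)) - 1]
    return lower, upper

hours = [1, 2, 3, 4, 5, 6, 7, 8]          # made-up data
score = [52, 55, 61, 60, 68, 70, 75, 74]  # made-up data
r = pearson_r(hours, score)
low, high = bootstrap_ci(hours, score)
print(r, low, high)
```

With only 8 pairs the interval is wide and unstable; the point is the procedure, not the numbers.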
CONTINUOUS VARIABLES
ordinal scales are treated as continuous when they have more than 10 categories
SPEARMAN'S RHO measures the strength and direction of the association between 2 variables, used for discrete ordinal variables
WORKS ON RANKS: THE VALUES ARE REPLACED BY THEIR RANKS, SO DISCRETE ORDINAL (WHOLE-NUMBER) DATA ARE ENOUGH; CONTINUOUS DATA CAN BE RANKED TOO
CAN BE USED WHEN PEARSON ASSUMPTIONS ARE VIOLATED
to test whether the relationship is significant, we can substitute rho for Pearson's r and use the same formula as for Pearson's r
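A short sketch of Spearman's rho as "Pearson's r computed on ranks". The helper names and data are made up. The notes do not spell out which formula is reused for significance, so the t statistic at the end is an assumption (the common t = r * sqrt((n - 2) / (1 - r^2)) test), not something taken from the source.

```python
from statistics import fmean

def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5, 6]  # made-up ordinal scores
y = [2, 1, 4, 3, 6, 5]
rho = spearman_rho(x, y)

# ASSUMPTION: the reused "same formula" is the t test for a correlation
n = len(x)
t = rho * ((n - 2) / (1 - rho ** 2)) ** 0.5
print(rho, t)
```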
TAU B (SPSS ONLY): used when we have many tied ranks or a small sample size - tied ranks occur when observations have the same value, so it is impossible to assign unique rank numbers
usually lower than Pearson's r and Spearman's rho.
correlation does not mean causality; see the 4 reasons in the notes
WEEK 1
inter-quartile range: range of the middle 50% of the data (split the ordered dataset into two equal halves, take the median of the first half and the median of the second half, then compute median2 - median1)
- z-score: the value that is compared against the critical values
- larger vs smaller portion: used to calculate the percentage of data to the right or to the left of your line; the line is drawn at the z-score
- the percentage between two values can be calculated:
. if both are on the same side of the mean: take the smaller portion of each, then bigger - smaller
. if instead they span the mean: larger portion of one - smaller portion of the other OR 1 - smaller portion 1 - smaller portion 2
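The two recipes above (quartiles via the median of each half, and areas between two z-scores) can be sketched as follows. The function names are made up, and the "smaller portion" is read here as the tail area beyond the z-score, which is an interpretation of the z-table columns rather than something the notes define.

```python
from statistics import NormalDist, median

def iqr_by_halves(data):
    """IQR as median(upper half) - median(lower half); needs at least 2 values."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]  # the middle value is left out when n is odd
    return median(upper) - median(lower)

def smaller_portion(z):
    """Tail area beyond the z-score (the 'smaller portion' of a z-table)."""
    return 1 - NormalDist().cdf(abs(z))

def proportion_between(z1, z2):
    """Share of a normal distribution lying between two z-scores."""
    if z1 * z2 > 0:  # both on the same side of the mean
        return abs(smaller_portion(z1) - smaller_portion(z2))
    # the two values span the mean
    return 1 - smaller_portion(z1) - smaller_portion(z2)

iqr = iqr_by_halves([1, 2, 3, 4, 5, 6, 7, 8])
same_side = proportion_between(1.0, 2.0)
spanning = proportion_between(-1.96, 1.96)
print(iqr, same_side, spanning)
```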
WEEK 2
error: the part of the outcome that our model cannot explain
- outcome= model+ error
- variables (measured constructs) vs parameters (hypothetical)
- if the result is below the mean --> the model overestimates vs above --> underestimates
- how much error is there? total deviance (= total error) = Σ (xj - mean), the sum of each observation's deviance from the mean
- deviance vs error
- confidence intervals: we can estimate a range of values which is likely to include the unknown population parameter
- Central limit theorem: the sampling distribution of sample means is approximately normal --> this applies even when the population is not normally distributed; it does not work reliably when the sample size is < 30
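The chain from "outcome = model + error" to a confidence interval can be followed step by step, using the mean as the model. This is a sketch on simulated, made-up data (40 values, so the normal approximation is reasonable); the 95% level and the z-based interval are assumptions for illustration.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(1)
scores = [rng.gauss(100, 15) for _ in range(40)]  # made-up sample, n = 40

n = len(scores)
mean = fmean(scores)                              # the model
deviances = [x - mean for x in scores]            # error of each observation
sum_of_squares = sum(d ** 2 for d in deviances)   # total squared error
variance = sum_of_squares / (n - 1)
sd = variance ** 0.5                              # standard deviation of the sample
standard_error = sd / n ** 0.5                    # SD of the sampling distribution

z = NormalDist().inv_cdf(0.975)                   # 1.96 for a 95% interval
ci = (mean - z * standard_error, mean + z * standard_error)
print(mean, sd, standard_error, ci)
```

Note that the raw deviances sum to (almost exactly) zero, which is why the squared deviances are used to quantify the total error.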
DO NOT CONFUSE
- standard deviation of the population
- standard deviation of the sample
- standard deviation of the sampling distribution (= standard error)
WEEK 3
- null hypothesis significance testing: the null hypothesis (H0) is a statement of no difference, no association, or no treatment effect; it is rejected when the p-value < α
- alternative hypothesis: statement that there is a difference, association or treatment effect (H1)
- Fisher: tests of significance
- Neyman, Pearson: tests of acceptance
- a test statistic is a statistic for which we know how frequently different values occur; it compares the data with what is expected under the null hypothesis
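As a worked instance of the decision rule (reject H0 when the p-value < α), here is a sketch of a one-sample z test. It assumes the population SD is known, which is rarely true in practice; the data, the null mean of 100 and the SD of 15 are all made up.

```python
from statistics import NormalDist, fmean

def one_sample_z_test(sample, null_mean, population_sd, alpha=0.05):
    """Return (z, two-sided p-value, reject H0?) for a one-sample z test."""
    n = len(sample)
    standard_error = population_sd / n ** 0.5
    # Test statistic: distance of the sample mean from H0 in standard errors
    z = (fmean(sample) - null_mean) / standard_error
    # Probability of a value at least this extreme if H0 were true
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

sample = [104, 110, 98, 107, 112, 101, 109, 105, 108]  # made-up scores
z, p_value, reject = one_sample_z_test(sample, null_mean=100, population_sd=15)
print(z, p_value, reject)
```

Here the sample mean is above 100, but the p-value stays above 0.05, so H0 is not rejected.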
TYPE I error:
- incorrect rejection of a true null hypothesis
TYPE II error:
- failure to reject a false null hypothesis; more likely with small samples and small effects
- problems with null hypothesis statistical testing (common misreadings, all wrong): a smaller p-value means a stronger effect/difference; statistical significance is synonymous with theoretical or practical significance; a non-significant effect means that the null hypothesis is true
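Both error types can be seen in a small simulation. This is a rough sketch under strong assumptions: normally distributed data with a known SD of 1, a two-sided z test of H0: mean = 0, and a made-up "small effect" of 0.2. The function name `rejection_rate` is hypothetical.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, n, trials=2000, alpha=0.05, seed=0):
    """Share of simulated samples in which H0 (mean = 0) is rejected."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / n ** 0.5)  # known SD = 1
        if abs(z) > critical:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type I error, so the rate should be near alpha
type_i_rate = rejection_rate(true_mean=0.0, n=20)

# H0 is false (small effect): every non-rejection is a Type II error
power_small_sample = rejection_rate(true_mean=0.2, n=20)
power_large_sample = rejection_rate(true_mean=0.2, n=200)
type_ii_small_sample = 1 - power_small_sample
type_ii_large_sample = 1 - power_large_sample
print(type_i_rate, type_ii_small_sample, type_ii_large_sample)
```

The same small effect is usually missed with n = 20 and usually detected with n = 200, which is also why a non-significant result cannot be read as proof that the null hypothesis is true.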