EXPLORATORY DATA ANALYSIS
ASSUMPTIONS OF PARAMETRIC TESTS
DEFINITION
TEST OF NORMALITY
OUTLIERS
HOMOGENEITY OF VARIANCE
TRANSFORMATION
examination of data to describe their main features
used to detect outliers
the process of using statistical tools to investigate data sets in order to understand their important characteristics
INTERVAL DATA
NORMALLY DISTRIBUTED
INDEPENDENT VALUES
HOMOGENEITY OF VARIANCE
value from one subject does not influence value of another
data measured at least at the interval level
bell shaped; conduct test of normality
Variance should be the same throughout data
Kolmogorov-Smirnov & Shapiro-Wilk tests
Normal probability plot (Q-Q plot)
Histogram & stem and leaf plot
Box-plot
Skewness and kurtosis
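values should be zero for a normal distribution
skewness: measure of symmetry / kurtosis: whether data are peaked or flat
absolute z-score (statistic / standard error) > 1.96 = not normally distributed
p > 0.05 = distribution is not significantly different from normal (normal)
observed values plotted should fall on the line (normal)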
extreme, quartile, median
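The skewness/kurtosis check and the Q-Q plot idea above can be sketched in a few lines. This is a minimal illustration, not a replacement for a statistics package: the function names and sample data are hypothetical, the skewness and kurtosis are simple moment-based estimates, and the standard errors sqrt(6/n) and sqrt(24/n) are large-sample approximations that may differ from what a given package reports.

```python
import math
from statistics import NormalDist, mean, pstdev


def skewness_kurtosis_z(data):
    """Return (skewness, excess kurtosis, z_skew, z_kurt) for a sample.

    Assumption: standard errors use the large-sample approximations
    sqrt(6/n) and sqrt(24/n).
    """
    n = len(data)
    m = mean(data)
    s = pstdev(data)
    skew = sum((x - m) ** 3 for x in data) / (n * s ** 3)
    kurt = sum((x - m) ** 4 for x in data) / (n * s ** 4) - 3.0
    z_skew = skew / math.sqrt(6.0 / n)
    z_kurt = kurt / math.sqrt(24.0 / n)
    return skew, kurt, z_skew, z_kurt


def qq_points(data):
    """Pair each sorted observation with its theoretical normal quantile."""
    n = len(data)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(data)))


# Hypothetical samples, for illustration only.
symmetric = [2, 4, 4, 5, 5, 5, 6, 6, 8]
skewed = [1, 1, 1, 2, 2, 3, 4, 9, 30]

for name, sample in (("symmetric", symmetric), ("skewed", skewed)):
    _, _, z_skew, z_kurt = skewness_kurtosis_z(sample)
    flagged = abs(z_skew) > 1.96 or abs(z_kurt) > 1.96
    print(name, round(z_skew, 2), round(z_kurt, 2), "flagged" if flagged else "not flagged")
```

Plotting the pairs from qq_points gives the Q-Q plot: points close to a straight line suggest normality.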
LEVENE'S TEST
VARIANCE RATIO (VR)
any number that lies more than 1.5 times the IQR above the upper quartile or below the lower quartile
IQR (interquartile range): difference between the upper quartile and lower quartile
values that are widely separated from the rest
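A minimal sketch of the 1.5 x IQR rule, assuming the "inclusive" quartile method from Python's statistics module (other quartile conventions give slightly different fences). The function name and data are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(data, k=1.5):
    """Return (iqr, (lower_fence, upper_fence), outliers) using the k x IQR rule."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [x for x in data if x < lower or x > upper]
    return iqr, (lower, upper), outliers


# Hypothetical data with one value far from the rest.
values = [10, 12, 12, 13, 14, 15, 15, 16, 40]
iqr, fences, outliers = iqr_outliers(values)
print(iqr, fences, outliers)
```

Here the quartiles are 12 and 15, so the IQR is 3, the fences are 7.5 and 19.5, and only 40 is flagged.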
POSSIBLE REASONS
misclassified measurement - belongs to a population different from the one the rest of the sample was drawn from
represents a rare/chance event
measurement invalid
p < 0.05 = variances are not equal
p > 0.05 = variances are equal
tests whether the variances in different groups are the same
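To show what Levene's test measures, here is a rough sketch of the W statistic using absolute deviations from each group mean. It computes the statistic only: the p-value comes from an F distribution with (k - 1, N - k) degrees of freedom, which the standard library does not provide, so in practice a package routine such as SciPy's scipy.stats.levene would be used. The function name and groups are hypothetical.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W statistic based on absolute deviations from group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every value from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical groups: a and b have the same spread, c is much more spread out.
group_a = [1, 2, 3, 4, 5]
group_b = [11, 12, 13, 14, 15]
group_c = [0, 10, 20, 30, 40]

print(levene_w(group_a, group_b))  # same spread, so W is zero
print(levene_w(group_a, group_c))  # different spread, so W is larger
```

A larger W corresponds to a smaller p-value, that is, stronger evidence that the variances are not equal.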
Variance ratio = largest variance / smallest variance
if VR < 2, homogeneity can be assumed.
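The variance ratio is simple enough to compute directly. A small sketch, assuming sample variances (n - 1 denominator); the function name and data are hypothetical.

```python
from statistics import variance


def variance_ratio(*groups):
    """Largest sample variance divided by the smallest sample variance."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical groups, for illustration only.
group_a = [4, 5, 6, 7, 8]    # variance 2.5
group_b = [2, 5, 6, 7, 10]   # variance 8.5
group_c = [4, 5, 6, 7, 9]    # variance 3.7

for label, vr in (("a vs b", variance_ratio(group_a, group_b)),
                  ("a vs c", variance_ratio(group_a, group_c))):
    verdict = "homogeneity can be assumed" if vr < 2 else "homogeneity is doubtful"
    print(label, round(vr, 2), verdict)
```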
compares 2 or more groups
data not normally distributed
data should be transformed using a specific function
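As one illustration of a transformation, the sketch below applies a base-10 log to positively skewed data and compares skewness before and after. The log is only one possible choice (square root and reciprocal are other common ones) and it requires values greater than zero. The data and the skewness helper are hypothetical.

```python
import math
from statistics import mean, pstdev


def skewness(data):
    """Moment-based skewness: 0 for symmetric data, positive for a long right tail."""
    n = len(data)
    m = mean(data)
    s = pstdev(data)
    return sum((x - m) ** 3 for x in data) / (n * s ** 3)


# Hypothetical positively skewed data (all values must be > 0 for a log).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20, 60]
logged = [math.log10(x) for x in raw]

print(round(skewness(raw), 2), round(skewness(logged), 2))
```

The transformed values are still somewhat skewed here, but far less than the raw values; normality should be re-checked after any transformation.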