EXPLORATORY DATA ANALYSIS

ASSUMPTION OF PARAMETRIC TESTS

DEFINITION

TEST OF NORMALITY

OUTLIERS

HOMOGENEITY OF VARIANCE

TRANSFORMATION

examination of data to describe their main features

used to detect outliers

the process of using statistical tools to investigate data sets in order to understand their important characteristics

INTERVAL DATA

NORMALLY DISTRIBUTED

INDEPENDENT VALUES

HOMOGENEITY OF VARIANCE

value from one subject does not influence value of another

data measured at least at the interval level

bell-shaped distribution; check with a test of normality

variance should be the same throughout the data

Kolmogorov-Smirnov & Shapiro-Wilk tests

Normal probability plot (Q-Q plot)

Histogram & stem and leaf plot

Box-plot

Skewness and kurtosis

both values should be zero for a normal distribution

skewness: measure of symmetry / kurtosis: whether the data are peaked or flat

z-score (skewness or kurtosis divided by its standard error) beyond ±1.96 = not normally distributed
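The 1.96 cut-off is usually applied to a z-score, that is, the skewness or kurtosis statistic divided by its standard error. The sketch below shows one way to compute both z-scores with the standard library only. The function name `skew_kurtosis_z` is hypothetical, and the bias-adjusted estimators and standard-error formulas are an assumption about what the notes intend (they are the ones most statistics packages report).

```python
import math

def skew_kurtosis_z(values):
    """Return (z_skew, z_kurtosis): each statistic divided by its standard error."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(values) / n
    # Central moments, dividing by n.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    g1 = m3 / m2 ** 1.5          # moment skewness
    g2 = m4 / m2 ** 2 - 3.0      # excess kurtosis (0 for a normal curve)
    # Bias-adjusted sample versions (assumed convention).
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    # Standard errors of the two statistics under normality.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt

# Illustrative data only: a symmetric set and a right-skewed set.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 8, 20, 50]
for name, data in (("symmetric", symmetric), ("right_skewed", right_skewed)):
    z_skew, z_kurt = skew_kurtosis_z(data)
    flag = "not normal" if max(abs(z_skew), abs(z_kurt)) > 1.96 else "no evidence against normality"
    print(name, round(z_skew, 2), round(z_kurt, 2), flag)
```

With large samples even small departures push the z-score past 1.96, so the cut-off is best read alongside the plots.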

p > 0.05 = test is not significant (distribution does not differ significantly from normal)
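As a rough illustration of the p-value rule for the Kolmogorov-Smirnov and Shapiro-Wilk tests, the sketch below uses `scipy.stats.shapiro` and `scipy.stats.kstest`. The sample data, the seed, and the helper name `normality_p_values` are made up for the example and are not part of the notes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)                        # fixed seed, illustrative data only
normal_sample = rng.normal(loc=50, scale=10, size=200)
skewed_sample = rng.exponential(scale=10, size=200)   # clearly non-normal

def normality_p_values(sample):
    """Return (Shapiro-Wilk p-value, Kolmogorov-Smirnov p-value)."""
    sample = np.asarray(sample, dtype=float)
    p_shapiro = stats.shapiro(sample).pvalue
    # K-S against a normal curve with the sample's own mean and SD. Because the
    # parameters are estimated from the data, this p-value is only approximate.
    p_ks = stats.kstest(sample, "norm",
                        args=(sample.mean(), sample.std(ddof=1))).pvalue
    return p_shapiro, p_ks

for name, sample in (("normal", normal_sample), ("skewed", skewed_sample)):
    p_sw, p_ks = normality_p_values(sample)
    verdict = "treat as normal" if p_sw > 0.05 else "not normal"
    print(f"{name}: Shapiro-Wilk p={p_sw:.4f}, K-S p={p_ks:.4f} -> {verdict}")
```

A non-significant result only means the test found no evidence against normality; it does not prove the data are normal.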

observed values plotted should fall on the line (normal)

shows extreme values, quartiles and median

LEVENE'S TEST

VARIANCE RATIO (VR)

any value that is more than 1.5 times the IQR above the upper quartile or below the lower quartile

IQR: difference between the upper quartile and the lower quartile

values that are widely separated from the rest
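The quartile-based outlier rule can be sketched with the standard library as follows. The function name `iqr_outliers` and the data are hypothetical, and the multiplier 1.5 assumes the usual box-plot convention. Quartile definitions differ slightly between packages, so the fences may not match other software exactly.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1                      # interquartile range
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]

# Illustrative data only: one value is widely separated from the rest.
scores = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_outliers(scores))
```

Whether a flagged value is then removed, corrected, or kept depends on which of the possible reasons applies.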

POSSIBLE REASONS

misclassified measurement - belongs to a population different from the one the rest of the sample was drawn from

represents a rare/chance event

measurement invalid

p < 0.05 = variances are not equal

p > 0.05 = variances can be assumed equal

tests whether the variances in different groups are the same
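A minimal sketch of Levene's test using `scipy.stats.levene` is shown below. The two groups and the helper name `variances_equal` are invented for illustration. `center="mean"` selects the original mean-based version of the test; SciPy's default, `center="median"`, is the more robust Brown-Forsythe variant, and the notes do not say which one is intended.

```python
from scipy import stats

# Illustrative data only: similar means, very different spread.
group_a = [23, 25, 24, 26, 25, 24, 23, 25]
group_b = [10, 40, 22, 35, 5, 48, 15, 30]

def variances_equal(*groups, alpha=0.05):
    """True when Levene's test does not reject equal variances at alpha."""
    result = stats.levene(*groups, center="mean")
    return result.pvalue > alpha

if variances_equal(group_a, group_b):
    print("p > 0.05: variances can be assumed equal")
else:
    print("p < 0.05: variances are not equal")
```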

Variance ratio = largest variance / smallest variance

if VR < 2, homogeneity can be assumed

compares 2 or more groups
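The variance ratio is simple enough to compute by hand. A small sketch, assuming sample variances (n - 1 in the denominator) and a hypothetical helper name `variance_ratio`:

```python
import statistics

def variance_ratio(*groups):
    """Largest sample variance divided by the smallest, across all groups."""
    variances = [statistics.variance(g) for g in groups]
    # A group with zero variance would make the ratio undefined.
    if min(variances) == 0:
        raise ValueError("a group has zero variance")
    return max(variances) / min(variances)

# Illustrative data only.
group_1 = [1, 2, 3, 4, 5]       # sample variance 2.5
group_2 = [2, 4, 6, 8, 10]      # sample variance 10
vr = variance_ratio(group_1, group_2)
print(vr, "homogeneity can be assumed" if vr < 2 else "homogeneity is doubtful")
```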

data not normally distributed

data should be transformed using a specific function
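The notes do not name a particular function. As one common choice for positively skewed data, the sketch below applies a natural-log transformation and compares the skewness before and after. The data and the helper `moment_skewness` are illustrative assumptions, and a log transform only works for strictly positive values.

```python
import math

def moment_skewness(values):
    """Simple moment-based skewness (0 for a perfectly symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Illustrative right-skewed data; all values must be > 0 for math.log.
raw = [1, 1, 1, 2, 2, 3, 4, 8, 20, 50]
logged = [math.log(x) for x in raw]

print("skewness before:", round(moment_skewness(raw), 2))
print("skewness after: ", round(moment_skewness(logged), 2))
```

After transforming, the normality checks should be repeated on the transformed values rather than assumed to pass.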