Statistical Techniques Using R - I

Mean, Median, Mode

Mean

The arithmetic average of a data set: the sum of all values divided by the number of values.

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Median

The middle value when the data are ordered.

If n is odd, the median is the middle value; if n is even, it is the average of the two middle values.

Mode

The most frequently occurring value in a data set. A data set can have more than one mode, or none if no value repeats.
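The three measures above can be computed in R as follows. This is a minimal sketch: the data vector is made up for illustration, and `stat_mode` is a hypothetical helper, since base R's own `mode()` reports an object's storage type rather than the statistical mode.

```r
# Hypothetical sample data, chosen only for illustration.
x <- c(2, 4, 4, 5, 7, 9, 4)

mean_x   <- mean(x)    # arithmetic mean: sum(x) / length(x)
median_x <- median(x)  # middle value of the sorted data

# Hypothetical helper: returns every value tied for the highest frequency.
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[counts == max(counts)])
}
mode_x <- stat_mode(x)

print(c(mean = mean_x, median = median_x, mode = mode_x))
```

For this vector the mean is 5 and both the median and the mode are 4.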

Relationship

Empirical relation (approximate, for moderately skewed distributions): Mode = 3 x Median - 2 x Mean

Grouped Data

Mean

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$

Where $f_i$ is the frequency and $x_i$ is the midpoint of the i-th class interval.
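As a sketch of the grouped-data mean, the example below assumes four hypothetical class intervals with made-up frequencies, computes the class midpoints, and weights each midpoint by its frequency.

```r
# Hypothetical grouped data: classes 0-10, 10-20, 20-30, 30-40.
lower <- c(0, 10, 20, 30)
upper <- c(10, 20, 30, 40)
f     <- c(5, 8, 12, 5)          # class frequencies (made up)

x_mid <- (lower + upper) / 2     # class midpoints

# Grouped mean: sum of frequency * midpoint, divided by total frequency.
grouped_mean <- sum(f * x_mid) / sum(f)

# weighted.mean() gives the same result with the frequencies as weights.
check <- weighted.mean(x_mid, f)

print(grouped_mean)
```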

Range & Measure of Dispersion

Range

Range = Max - Min

Standard Deviation (SD)

Measures the dispersion of the data around the mean.

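For ungrouped data: $$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

For grouped data, with frequencies $f_i$, class midpoints $x_i$, and $N = \sum f_i$: $$\sigma = \sqrt{\frac{1}{N} \sum f_i (x_i - \bar{x})^2}$$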

Variance

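The square of the standard deviation. For ungrouped data: $$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

For grouped data, with frequencies $f_i$, class midpoints $x_i$, and $N = \sum f_i$: $$\sigma^2 = \frac{1}{N} \sum f_i (x_i - \bar{x})^2$$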
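Range, standard deviation, and variance can be sketched in R as below, using a made-up data vector. One point worth knowing: R's `sd()` and `var()` divide by n - 1 (the sample versions), so the divide-by-n population versions are computed by hand here.

```r
# Hypothetical sample data, chosen only for illustration.
x <- c(2, 4, 4, 5, 7, 9, 4)
n <- length(x)

# range() returns c(min, max), so the difference is taken explicitly.
data_range <- max(x) - min(x)

sample_var <- var(x)               # divides by n - 1
sample_sd  <- sd(x)                # square root of var(x)

pop_var <- mean((x - mean(x))^2)   # divides by n
pop_sd  <- sqrt(pop_var)

print(c(range = data_range, sample_sd = sample_sd, pop_sd = pop_sd))
```

The two variances differ only by the factor (n - 1) / n, which matters most for small samples.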

Inclusive vs Exclusive Classes

Inclusive Class

Upper and lower class boundaries are included.

Exclusive Class

Upper class boundary is excluded.

Conversion Between Them

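Inclusive to Exclusive: Take half the gap between the upper limit of one class and the lower limit of the next (e.g., 0.5), subtract it from each lower limit, and add it to each upper limit. For example, 10-19 and 20-29 become 9.5-19.5 and 19.5-29.5.

Exclusive to Inclusive: Reverse the adjustment: add the same small value (e.g., 0.5) to each lower boundary and subtract it from each upper boundary.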
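The sketch below illustrates one common convention for the inclusive-to-exclusive conversion: each class is widened by half the gap between consecutive classes, so that adjacent classes share a boundary. The class limits are hypothetical.

```r
# Hypothetical inclusive classes: 10-19, 20-29, 30-39.
lower <- c(10, 20, 30)
upper <- c(19, 29, 39)

# Correction factor: half the gap between one class's upper limit
# and the next class's lower limit.
correction <- (lower[2] - upper[1]) / 2

# Exclusive (continuous) class boundaries.
excl_lower <- lower - correction
excl_upper <- upper + correction

print(data.frame(inclusive = paste(lower, upper, sep = "-"),
                 exclusive = paste(excl_lower, excl_upper, sep = "-")))
```

After the conversion, each upper boundary equals the next class's lower boundary, which is what a histogram of continuous data requires.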

Coefficient Of Variation (CV)

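Measures variability relative to the mean, expressed as a percentage, so that data sets with different units or scales can be compared.

$$CV = \frac{\sigma}{\bar{x}} \times 100$$ where $\sigma$ is the standard deviation and $\bar{x}$ is the mean.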
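A minimal R sketch of the coefficient of variation follows. The data are made up, and the sample standard deviation from `sd()` is used here as an assumption; a course that defines SD with a divide-by-n formula would get a slightly smaller value.

```r
# Hypothetical sample data, chosen only for illustration.
x <- c(2, 4, 4, 5, 7, 9, 4)

# Coefficient of variation as a percentage of the mean.
cv_percent <- sd(x) / mean(x) * 100

print(round(cv_percent, 2))
```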

Graphical Representations

Histogram

A graph of adjacent bars, with no gaps between them, showing the frequency of each class interval of continuous data.

Bar Graph

Represents categorical data using bars.

Ogive Curve

Cumulative frequency curve.

Less Than & More Than Ogive

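A less-than ogive plots the cumulative frequency of values below each upper class boundary; a more-than ogive plots the cumulative frequency of values at or above each lower class boundary. The two curves intersect at the median.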
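The following sketch shows how a histogram and both ogives might be built in base R. The marks and class breaks are hypothetical; `right = FALSE` is chosen so that the classes are exclusive, that is, each class includes its lower boundary but not its upper one.

```r
# Hypothetical marks for 12 students.
marks  <- c(12, 18, 25, 27, 31, 34, 36, 41, 45, 48, 52, 58)
breaks <- seq(10, 60, by = 10)

# Class frequencies for exclusive classes [10,20), [20,30), ...
h    <- hist(marks, breaks = breaks, right = FALSE, plot = FALSE)
freq <- h$counts

less_than <- cumsum(freq)             # cumulative counts below each upper boundary
more_than <- rev(cumsum(rev(freq)))   # cumulative counts from each lower boundary up

# Histogram, then both ogives on one plot.
plot(h, main = "Histogram of marks", xlab = "Marks")
plot(breaks[-1], less_than, type = "o",
     xlim = range(breaks), ylim = c(0, sum(freq)),
     xlab = "Marks", ylab = "Cumulative frequency")
lines(breaks[-length(breaks)], more_than, type = "o", lty = 2)
```

For categorical data, `barplot()` plays the role that `hist()` plays here.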

Measure Of Dispersion

Quartile Deviation

Interquartile Range (IQR)

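Interquartile range, where $Q_1$ and $Q_3$ are the first and third quartiles: $$IQR = Q_3 - Q_1$$

Quartile deviation (semi-interquartile range): $$QD = \frac{Q_3 - Q_1}{2}$$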
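Quartiles, the interquartile range, and the quartile deviation can be sketched in R as below, on made-up data. Note that `quantile()` supports several interpolation rules and defaults to `type = 7`, so its quartiles may differ slightly from those found with a textbook hand method.

```r
# Hypothetical sample data, chosen only for illustration.
x <- c(2, 4, 4, 5, 7, 9, 4)

# First, second, and third quartiles (default interpolation, type = 7).
q <- quantile(x, probs = c(0.25, 0.5, 0.75))

iqr <- IQR(x)     # Q3 - Q1, using the same default rule
qd  <- iqr / 2    # quartile deviation

print(q)
print(c(IQR = iqr, QD = qd))
```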

Skewness

Symmetrical Distribution

Mean = Median = Mode

Positive Skewness

Tail on the right side (Mean > Median > Mode)

Negative Skewness

Tail on the left side (Mean < Median < Mode)

Karl Pearson's Coefficient of Skewness

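$$Sk_p = \frac{\text{Mean} - \text{Mode}}{\text{SD}}$$

When the mode is ill-defined, the empirical relation Mode = 3 x Median - 2 x Mean gives the alternative form $$Sk_p = \frac{3(\text{Mean} - \text{Median})}{\text{SD}}$$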

Bowley's Measure of Skewness

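$$Sk_B = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$$ where $Q_1$, $Q_2$ (the median), and $Q_3$ are the quartiles. The value lies between -1 and +1.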
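Both skewness coefficients can be sketched in R as follows. The data are hypothetical and right-skewed on purpose; the median-based form of Pearson's coefficient is used, and the quartiles come from `quantile()` with its default rule, both of which are choices made for this illustration.

```r
# Hypothetical right-skewed data.
x <- c(2, 3, 4, 5, 6, 8, 12, 20)

# Pearson's coefficient, median-based form: 3 * (mean - median) / SD.
pearson_sk <- 3 * (mean(x) - median(x)) / sd(x)

# Bowley's coefficient from the three quartiles.
q <- unname(quantile(x, probs = c(0.25, 0.5, 0.75)))
bowley_sk <- (q[3] + q[1] - 2 * q[2]) / (q[3] - q[1])

print(c(pearson = pearson_sk, bowley = bowley_sk))
```

Both values come out positive for this vector, matching the long right tail.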

Kurtosis

Measures the "tailedness" of the distribution

Leptokurtic

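Sharp peak and heavier tails than the normal distribution (kurtosis > 3)

Platykurtic

Flat peak and lighter tails than the normal distribution (kurtosis < 3)

Mesokurtic

Same kurtosis as the normal distribution (kurtosis = 3)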
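Base R has no built-in kurtosis function, so the sketch below defines a hypothetical helper, `kurt`, as the fourth central moment divided by the squared second central moment. Under this definition the normal distribution scores about 3. The three data sets are made up to show one case of each type.

```r
# Hypothetical helper: moment-based kurtosis, m4 / m2^2.
kurt <- function(v) {
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2
}

set.seed(42)
k_normal <- kurt(rnorm(10000))                  # close to 3 (mesokurtic)
k_flat   <- kurt(seq(-1, 1, length.out = 101))  # evenly spread values (platykurtic)
k_heavy  <- kurt(c(rep(0, 98), -10, 10))        # two extreme values (leptokurtic)

print(c(normal = k_normal, flat = k_flat, heavy = k_heavy))
```

Add-on packages such as `moments` and `e1071` also provide kurtosis functions, though some report excess kurtosis (kurtosis minus 3) instead.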

Bivariate Data

Correlation

Positive Correlation

Both variables increase together

Negative Correlation

One variable increases while the other decreases

No Correlation

No linear relationship between the variables

Covariance

$$\text{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

Coefficient of Correlation

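Measures the strength and direction of the linear relationship between two variables. It ranges from -1 (perfect negative) to +1 (perfect positive).

$$r = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}$$ where $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y.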
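Covariance and the correlation coefficient can be sketched in R as below, on made-up paired data. R's `cov()` divides by n - 1, but the correlation is unaffected by that choice because the same divisor cancels between numerator and denominator.

```r
# Hypothetical paired observations.
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

cov_xy <- cov(x, y)   # sample covariance (n - 1 divisor)
r      <- cor(x, y)   # Pearson correlation coefficient (the default method)

# The same coefficient from its definition: covariance over the product of SDs.
r_manual <- cov_xy / (sd(x) * sd(y))

print(c(covariance = cov_xy, r = r))
```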

Linear Regression

Describes how a dependent variable y changes with an independent variable x by fitting a straight line to the data.

Regression Equation

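Regression of Y on X: $$y - \bar{y} = b_{yx}(x - \bar{x}), \quad b_{yx} = r \frac{\sigma_Y}{\sigma_X}$$

Regression of X on Y: $$x - \bar{x} = b_{xy}(y - \bar{y}), \quad b_{xy} = r \frac{\sigma_X}{\sigma_Y}$$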
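A simple regression of y on x can be sketched in R with `lm()`. The data are hypothetical, and the manual slope and intercept are included only to show that the fitted coefficients agree with the covariance-based formula.

```r
# Hypothetical paired observations.
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Fit the line y = a + b * x by least squares.
fit <- lm(y ~ x)
a <- unname(coef(fit)[1])   # intercept
b <- unname(coef(fit)[2])   # slope

# The same coefficients from the definitions.
b_manual <- cov(x, y) / var(x)
a_manual <- mean(y) - b_manual * mean(x)

# Predicted y for a new x value.
y_hat <- unname(predict(fit, newdata = data.frame(x = 6)))

print(c(intercept = a, slope = b, prediction = y_hat))
```

Swapping the roles, `lm(x ~ y)`, gives the regression of x on y, which is generally a different line.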