Week 2
Describing Numerical Data
Measures of Central Tendency
Mean
(called AVERAGE in Excel)
Median
(the middle value if all values are arranged from smallest to largest)
Mode
(the most common value)
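The three measures above can be computed with Python's standard `statistics` module. This is a minimal sketch; the sample `values` is a hypothetical data set, not one from these notes.

```python
import statistics

# Hypothetical sample of seven values (an assumption for illustration)
values = [2, 3, 3, 5, 7, 8, 14]

mean_value = statistics.mean(values)      # sum of the values divided by their count
median_value = statistics.median(values)  # middle value once the values are sorted
mode_value = statistics.mode(values)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

In this sample the single large value (14) pulls the mean above the median, while the mode sits at the most repeated value.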
Measures of Shape
Skewness
is the extent of asymmetry in the distribution. If the distribution is symmetric then it is not skewed.
Negatively
skewed distribution typically demonstrates mean < median < mode
left tail skewness
Positively
skewed distribution typically demonstrates mean > median > mode
right tail skewness
If the distribution is
symmetric
then mean = median = mode
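One common way to put a number on skewness is the average cubed standardized deviation from the mean. The sketch below assumes that population formula; the helper `skewness` and the three data sets are hypothetical and only illustrate the sign of the result.

```python
import statistics

def skewness(values):
    """Population skewness: average of the cubed standardized deviations."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)

right_tailed = [2, 3, 3, 5, 7, 8, 14]     # hypothetical data with a long right tail
left_tailed = [-x for x in right_tailed]  # mirror image, so a long left tail
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive: positively skewed
print(skewness(left_tailed))   # negative: negatively skewed
print(skewness(symmetric))     # zero: not skewed
```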
Measures of Dispersion
Range
The difference between the maximum and minimum value
Variance
Roughly speaking, the average squared difference from the mean
Standard Deviation
Square root of the variance. Roughly speaking, the standard deviation measures the average amount by which values vary above and below the mean
Represented as s (the variance is s^2)
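The dispersion measures can be sketched in Python as follows. The data set is hypothetical, and the sample versions of the formulas are assumed, which divide by n - 1 rather than n.

```python
import statistics

# Hypothetical sample (an assumption for illustration)
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)        # maximum minus minimum
sample_variance = statistics.variance(values)  # s^2: squared deviations summed, divided by n - 1
sample_sd = statistics.stdev(values)           # s: square root of the sample variance

print(value_range, sample_variance, sample_sd)
```

Because the variance is in squared units, the standard deviation is usually the easier one to interpret: it is in the same units as the data.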
Describing Numerical Data with Distributions
Probability Distributions for Numeric Data
Measures of central tendency, shape and dispersion are useful, but sometimes we want a more detailed description of the distribution. This can be achieved by looking at the
frequency distribution of a variable
A more appealing way of presenting this is through a histogram
We can think of the frequency distribution as describing the probability distribution for income across the various ranges defined
To be a proper probability distribution the outcomes/events listed in a probability distribution should have 2 characteristics
Mutually exclusive
no two outcomes in the list can be true at the same time.
e.g. If we get a head, we can’t get a tail! This is true in the example above, as the income ranges/categories do not overlap.
Exhaustive
the list includes all possible outcomes.
e.g. We must get either a head or a tail. There are no other possibilities. This is true in the examples above, because the ranges/categories cover all possible values. This is why the sum of probabilities in a probability distribution always equals one.
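A frequency distribution can be turned into a probability distribution by dividing each count by the total. The sketch below assumes hypothetical incomes and hypothetical range boundaries; the ranges do not overlap (mutually exclusive) and cover every possible value (exhaustive), so the probabilities sum to one.

```python
from collections import Counter

# Hypothetical incomes in thousands (an assumption for illustration)
incomes = [12, 18, 25, 31, 38, 42, 47, 55, 63, 90]

def income_range(income):
    """Assign each income to exactly one non-overlapping range."""
    if income < 20:
        return "under 20"
    if income < 40:
        return "20 to under 40"
    if income < 60:
        return "40 to under 60"
    return "60 and over"

# Frequency distribution: how many incomes fall in each range
frequencies = Counter(income_range(x) for x in incomes)

# Probability distribution: each frequency as a share of the total
probabilities = {label: count / len(incomes) for label, count in frequencies.items()}

for label, p in probabilities.items():
    print(label, p)
print("total:", sum(probabilities.values()))
```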
Normal Distribution
A probability distribution that is symmetric around the mean
Can be worked with in 2 ways (Excel functions)
p1 = NORM.DIST(x1, mean, sd, TRUE)
Tells us the cumulative probability of falling at or below a particular value (x1) on the distribution
x1 = NORM.INV(p1, mean, sd)
Given the cumulative probability (p1), tells us the corresponding point (x1) on the distribution
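The same two lookups can be sketched in Python with `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods play the roles of the cumulative NORM.DIST and NORM.INV calls. The mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical distribution parameters (assumptions for illustration)
dist = NormalDist(mu=100, sigma=15)

# Value to probability: share of the distribution at or below x1
x1 = 115
p1 = dist.cdf(x1)

# Probability to value: the point below which a share p1 of the distribution lies
x1_back = dist.inv_cdf(p1)

print(p1, x1_back)
```

Here x1 sits one standard deviation above the mean, so p1 comes out at roughly 0.84, and feeding p1 back through `inv_cdf` recovers x1.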
Distributions
Can be presented in two ways:
Probability Density Function (PDF)
Cumulative Distribution Function (CDF)
Plots what proportion of the distribution falls below a particular value
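The link between the two presentations is that the CDF is the accumulated area under the PDF. A minimal sketch for the standard normal distribution, assuming a simple midpoint sum (with a hypothetical 1000 steps) to approximate the area:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

density_at_zero = z.pdf(0)  # height of the curve at 0 (a density, not a probability)
below_zero = z.cdf(0)       # proportion of the distribution that falls below 0

# Area under the PDF between -1 and 1, approximated with a midpoint sum
steps = 1000
width = 2 / steps
area = sum(z.pdf(-1 + (i + 0.5) * width) * width for i in range(steps))

# The same proportion read directly from the CDF
within_one_sd = z.cdf(1) - z.cdf(-1)

print(density_at_zero, below_zero, area, within_one_sd)
```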
Values
Degrees of freedom = n-1
Z = standard normal distribution