Reading 07: Statistical Concepts and Market Returns
Basic terms
Descriptive vs. Inferential Statistics
Descriptive Statistics
: describes and summarizes the characteristics of a data set
Inferential statistics
: uses samples to make forecasts about the population.
Population vs. Sample
Population
: includes all members of a specified group
Sample
: a subset of the population, used to draw inferences about the population
Measurement scales
N O I R
Nominal scale
: Only NAME makes sense
Ordinal Scale
: ORDER also makes sense
Interval Scale
: INTERVALS also make sense
Ratio Scale
: RATIOS also make sense. Has a true (absolute) zero point
Parameter vs. Sample statistics
Parameter
: describes a characteristic of a population.
Sample Statistic
: describes a characteristic of a sample.
Distribution
Frequency Distribution
How to create Frequency Distribution?
Step 1: Define the intervals.
Step 2: Tally the observations.
Step 3: Count the observations.
A frequency distribution is a tabular presentation that summarizes data by assigning observations to specified intervals.
Relative vs. Cumulative Frequencies
Relative Frequency
: percentage of total observations that fall in each interval.
Cumulative Frequency
: percentage of observations that are less than the upper bound of each interval.
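The three steps (define the intervals, tally, count) and the relative and cumulative frequencies can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `frequency_distribution`, the half-open intervals `[lower bound, upper bound)`, and the sample returns (in whole percent) are all assumptions made for the example.

```python
def frequency_distribution(data, lower, width, n_intervals):
    """Tally data into equal-width, half-open intervals [lo, hi).

    Returns one row per interval with its absolute, relative, and
    cumulative relative frequency.
    """
    counts = [0] * n_intervals
    for x in data:
        idx = int((x - lower) // width)  # which interval x falls into
        if not 0 <= idx < n_intervals:
            raise ValueError(f"{x} lies outside the defined intervals")
        counts[idx] += 1

    total = len(data)
    rows = []
    running = 0
    for i, count in enumerate(counts):
        running += count
        rows.append({
            "interval": (lower + i * width, lower + (i + 1) * width),
            "frequency": count,
            "relative": count / total,      # share of observations in this interval
            "cumulative": running / total,  # share below this interval's upper bound
        })
    return rows


# Hypothetical annual returns, in whole percent
sample_returns = [-8, -3, -1, 0, 2, 4, 5, 7, 9, 12]
for row in frequency_distribution(sample_returns, lower=-10, width=5, n_intervals=5):
    print(row)
```

The cumulative column always ends at 1.0 (100%), which is a quick sanity check on the tally.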
Histogram
Horizontal axis shows intervals
Vertical axis shows frequency
Polygon
: A line connects the points plotted at each interval's midpoint (x = middle value of the interval; y = frequency of that interval)
Measure of Central Tendency
Arithmetic mean
Population mean: \( \mu =\frac{\sum_{i=1}^{N}X_{i}}{N} \)
Sample mean: \( \overline{X}=\frac{\sum_{i=1}^{n}X_{i}}{n} \)
Good for forecasting a future single-period return
Geometric mean
Used to find compound growth rate:
\( G=\sqrt[n]{X_{1}\times X_{2}\times ... \times X_{n}} \)
Good for forecasting compound returns over multiple periods.
Weighted mean
\( \overline{X}_{W}=\sum_{i=1}^{n}W_{i}\times X_{i} \)
Weights each value according to its influence; the weights \( W_{i} \) sum to 1 (e.g., portfolio weights).
Harmonic mean
Used to find the average purchase price per share when the same amount of money is invested at each price
\( \overline{X}_{H}=\frac{N}{\sum_{i=1}^{N}\frac{1}{X_{i}}} \)
Equivalent to total money paid divided by total number of shares bought
Median
: Midpoint of a dataset when arranged from largest to smallest
Mode
: Value that occurs the most frequently in a dataset
For positive values that are not all equal: Arithmetic mean > Geometric mean > Harmonic mean. The greater the dispersion among the values, the larger the gaps between the three means.
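The measures of central tendency above can be compared side by side with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). This is a sketch under stated assumptions: all returns, weights, and prices are hypothetical, and the geometric mean return is computed on growth factors \(1 + R\) with 1 subtracted at the end, which is the usual way to apply the geometric mean formula to returns that may be negative.

```python
import statistics

# Hypothetical annual returns for one asset
returns = [0.10, -0.05, 0.20]

arithmetic = statistics.mean(returns)

# Geometric mean return: compound the growth factors (1 + R), then subtract 1
growth_factors = [1 + r for r in returns]
geometric = statistics.geometric_mean(growth_factors) - 1

# Weighted mean: hypothetical portfolio weights (summing to 1) and asset returns
weights = [0.5, 0.3, 0.2]
asset_returns = [0.08, 0.04, 0.12]
weighted = sum(w * x for w, x in zip(weights, asset_returns))

# Harmonic mean: average price paid per share when the same amount
# (say 1,000) is invested at each of these hypothetical prices
prices = [8.0, 10.0]
harmonic = statistics.harmonic_mean(prices)
total_paid = 1000 * len(prices)
total_shares = sum(1000 / p for p in prices)

# Median and mode of a small hypothetical data set
observations = [3, 1, 4, 1, 5]
median = statistics.median(observations)
mode = statistics.mode(observations)

print(f"arithmetic mean return: {arithmetic:.4f}")
print(f"geometric mean return:  {geometric:.4f}")
print(f"weighted mean return:   {weighted:.4f}")
print(f"harmonic mean price:    {harmonic:.4f}")
print(f"paid / shares:          {total_paid / total_shares:.4f}")
print(f"median = {median}, mode = {mode}")
```

For the two prices, the arithmetic mean (9.00) exceeds the geometric mean, which in turn exceeds the harmonic mean, consistent with the ordering stated above.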
Quantile
Definition
: quantiles are the cut-off values that divide an ordered data set into equal-sized, adjacent subgroups
Example
Quartiles
: distribution is divided into quarters
Quintiles
: Distribution is divided into fifths
Deciles
: Distribution is divided into tenths
Percentiles
: Distribution is divided into hundredths
POSITION of the observation
: \(L=\frac{\left(n+1\right)\times y}{100} \)
LINEAR INTERPOLATION
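When the location L falls between two observations (for example, L = 8.25 lies between the 8th and 9th observations), interpolate linearly between them to find the percentile value:
\( X^{th}\ \text{observation value} + (\text{decimal part of } L) \times \left[ (X+1)^{th}\ \text{observation value} - X^{th}\ \text{observation value} \right] \)
where X is the integer part of L and the observations are sorted in ascending order.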
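The position formula and the interpolation step can be combined in a short Python sketch. The function name `percentile`, the sample data, and the choice to return the smallest or largest observation when L falls outside the range 1 to n are assumptions made for this illustration.

```python
def percentile(data, y):
    """y-th percentile via L = (n + 1) * y / 100 with linear interpolation."""
    ordered = sorted(data)           # ascending order
    n = len(ordered)
    location = (n + 1) * y / 100     # position L (1-based)
    k = int(location)                # integer part of L
    fraction = location - k          # decimal part of L

    # Assumption: clamp to the ends when L falls outside 1..n
    if k < 1:
        return ordered[0]
    if k >= n:
        return ordered[-1]

    lower = ordered[k - 1]           # k-th observation (1-based)
    upper = ordered[k]               # (k + 1)-th observation
    return lower + fraction * (upper - lower)


# Hypothetical data set with n = 10 observations
data = [8, 10, 12, 13, 15, 17, 18, 19, 23, 24]
third_quartile = percentile(data, 75)   # L = 11 * 75 / 100 = 8.25
median = percentile(data, 50)           # L = 11 * 50 / 100 = 5.5
print(third_quartile, median)
```

For the third quartile, L = 8.25, so the value is the 8th observation (19) plus 0.25 times the gap to the 9th observation (23 - 19), which gives 20.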
Dispersion
Measure of risk
Range
: Max Value - Min Value
Mean Absolute Deviation
(MAD)
is the average of the absolute values of deviations from the mean
\(MAD=\frac{\sum_{i=1}^{n}\left | X_{i}-\overline{X} \right |}{n} \)
Higher MAD means greater risk
Variance
Definition
: Mean of squared deviations from the arithmetic mean
Population Variance
: \(\sigma^{2}=\frac{\sum_{i=1}^{N}\left (X_{i}-\mu \right )^{2}}{N} \)
Sample Variance
: \(S^{2}=\frac{\sum_{i=1}^{n}\left (X_{i}-\overline{X} \right )^{2}}{n-1} \)
Standard Deviation
: is the positive square root of Variance
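The dispersion measures so far (range, MAD, variance, standard deviation) can be checked numerically with a short Python sketch. The data set is hypothetical, and `mean_absolute_deviation` is a helper written for this example; the variance and standard deviation calls are from Python's standard `statistics` module.

```python
import statistics


def mean_absolute_deviation(data):
    """Average of the absolute deviations from the arithmetic mean."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)


# Hypothetical observations with mean 5
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

value_range = max(data) - min(data)                # max value - min value
mad = mean_absolute_deviation(data)

population_variance = statistics.pvariance(data)   # divides by N
sample_variance = statistics.variance(data)        # divides by n - 1
population_sd = statistics.pstdev(data)            # square root of population variance
sample_sd = statistics.stdev(data)                 # square root of sample variance

print(f"range = {value_range}, MAD = {mad}")
print(f"population variance = {population_variance}, SD = {population_sd}")
print(f"sample variance = {sample_variance:.4f}, SD = {sample_sd:.4f}")
```

Because the sample variance divides by n - 1 rather than N, it is always slightly larger than the population variance computed on the same numbers.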
Chebyshev's inequality
Definition
: For any distribution, the proportion of observations within k standard deviations (SD) of the mean is \( \geq 1-\frac{1}{k^{2}} \) for all k > 1
Specifically
\( \pm \)1.25 SD
: at least 36% of observations lie within
\( \pm \)1.5 SD
: at least 56% of observations lie within
\( \pm \)2 SD
: at least 75% of observations lie within
\( \pm \)3 SD
: at least 89% of observations lie within
\( \pm \)4 SD
: at least 94% of observations lie within
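The percentages listed above follow directly from \( 1-\frac{1}{k^{2}} \). A minimal Python sketch reproduces them; the function name `chebyshev_lower_bound` is chosen for this example only.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is only informative for k > 1")
    return 1 - 1 / k**2


for k in (1.25, 1.5, 2, 3, 4):
    bound = chebyshev_lower_bound(k)
    print(f"+/-{k} SD: at least {bound:.0%} of observations")
```

These are lower bounds that hold for any distribution; a specific distribution can have a larger share of observations within the same range.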
Coefficient of Variation & Sharpe Ratio
Coefficient of Variation
: \( CV=\frac{s}{\overline{X}} \)
Lower CV is better, less risk per unit of return
Sharpe Ratio
: measures excess return per unit of risk
\( Ratio_{Sharpe}=\frac{\overline{r}_{p}-\overline{r}_{f}}{\sigma _{p}} \)
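Both ratios can be computed from a return series in a few lines of Python. This is a sketch under assumptions: the function names and return figures are hypothetical, and the sample standard deviation (`statistics.stdev`) is used as the risk estimate in both ratios.

```python
import statistics


def coefficient_of_variation(returns):
    """Risk per unit of mean return: s / mean."""
    return statistics.stdev(returns) / statistics.mean(returns)


def sharpe_ratio(portfolio_returns, risk_free_returns):
    """Mean excess return per unit of risk (sample SD of portfolio returns)."""
    excess = statistics.mean(portfolio_returns) - statistics.mean(risk_free_returns)
    return excess / statistics.stdev(portfolio_returns)


# Hypothetical annual returns
portfolio = [0.12, 0.08, 0.10, 0.14, 0.06]
risk_free = [0.02, 0.02, 0.02, 0.02, 0.02]

print(f"CV     = {coefficient_of_variation(portfolio):.4f}")
print(f"Sharpe = {sharpe_ratio(portfolio, risk_free):.4f}")
```

A lower CV and a higher Sharpe ratio both point to a better trade-off between risk and return, but the Sharpe ratio measures return in excess of the risk-free rate while the CV does not.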
Skewness & Kurtosis
Skewness
Definition
: Describes the degree to which a distribution is not symmetric about its mean. As a rule of thumb, sample skewness with an absolute value greater than 0.5 is considered significantly different from 0
Right Skew
: Positive skewness.
Mean > Median > Mode
Left Skew
: Negative skewness
Mean < Median < Mode
Formula
: \(Sample\ skewness\ =\ \frac{\frac{1}{n}\sum_{i=1}^{n}{(X_i-\bar{X})}^3}{s^3}\), where s is the sample standard deviation
Kurtosis
Definition
: Measures the peakedness of a distribution and the probability of extreme outcomes (thickness of the tails)
Kurtosis = 3 a.k.a Mesokurtic
: Normal Distribution
Kurtosis >3 a.k.a Leptokurtic
(mnemonic: "young mountain", a tall, sharp peak)
Positive excess kurtosis: more peaked with fatter tails than a normal distribution.
Higher likelihood of extreme values (large fluctuations) compared to a normal distribution.
For risk-seeking investors (higher chance of extreme gain)
Kurtosis < 3 a.k.a Platykurtic
(mnemonic: "old mountain", a low, flat peak)
Negative excess kurtosis
Lower likelihood of extreme values compared to a normal distribution.
For risk-averse investors (less chance of extreme loss)
Sample kurtosis formula
: \(Sample\ kurtosis\ =\ \frac{\frac{1}{n}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^4}{s^4} \), where s is the sample standard deviation
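The two sample formulas can be applied directly in Python. This is a minimal sketch, assuming the forms given above (sum of cubed or fourth-power deviations divided by n, then by \(s^3\) or \(s^4\), with s the sample standard deviation); the function names and data sets are hypothetical, and excess kurtosis is taken as kurtosis minus 3, the value for a normal distribution.

```python
import statistics


def sample_skewness(data):
    """(1/n) * sum of cubed deviations from the mean, divided by s^3."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation
    return sum((x - mean) ** 3 for x in data) / n / s**3


def sample_kurtosis(data):
    """(1/n) * sum of fourth-power deviations from the mean, divided by s^4."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return sum((x - mean) ** 4 for x in data) / n / s**4


def excess_kurtosis(data):
    """Kurtosis relative to the normal distribution's value of 3."""
    return sample_kurtosis(data) - 3


symmetric = [1, 2, 3, 4, 5]        # hypothetical symmetric data
right_skewed = [1, 1, 2, 2, 10]    # hypothetical data with one large value

print(f"skewness (symmetric):    {sample_skewness(symmetric):.4f}")
print(f"skewness (right-skewed): {sample_skewness(right_skewed):.4f}")
print(f"excess kurtosis (symmetric): {excess_kurtosis(symmetric):.4f}")
```

The right-skewed set has mean 3.2, median 2, and a positive skewness, matching the Mean > Median pattern noted for right skew. With only five observations these statistics are very noisy, so the output is illustrative rather than a meaningful estimate.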