NEM1002 Statistics for Decision Making - Coggle Diagram
NEM1002 Statistics for Decision Making
S1 Univariate Statistics
Admin
1 hr in class = 2 hours outside class
Every class has documents to download
Statistics
Variables
Univariate = 1 variable
Bivariate = 2 variables
Types of data
Numerical
"Quantitative"
Discrete
Finite
Whole numbers (integers) or known decimals (not 0.1-.10)
Countable
Continuous
infinite
Data is grouped into class intervals in tables, e.g. 1-10, 11-20
Histogram
Categorical
"Qualitative"
Ordinal
Ranked categories
eg preference, fast and slow, etc
Nominal or Non-ordinal
Can't be ranked
Measured in frequency
Bar
Descriptive stats
Spread
Q1
Median of the lower half of the data (25% mark)
Mean
Average
Sum of data / number of data points
Median / Q2
Centre number of data
Position of the median = (number of data points + 1) / 2
If this falls between 2 numbers, take their average
Mode
Most occurring data point
Q3
Median of the upper half of the data (75% mark)
Quartiles exclude the median when the data is split into halves
If the median is an average, include the 2 data points the average is derived from (one in each half)
Dispersion
Range
Difference between max. and min. value
I.Q.R
Q3 - Q1, the spread of the middle 50% of the data
Interquartile Range
Deviation
Difference between data value and mean
Can be + or -, all values sum to 0
Measures spread of data
Standard Deviation
Square root of (sum of squared deviations / (number of data points - 1))
Dividing by n - 1 is for a sample; for a population (complete data set), divide by the number of data points
Variance
Average of the squared deviations from the mean (in squared units)
Standard deviation Squared
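The descriptive statistics above can be sketched in Python as a cross-check for hand or Excel working. This is a minimal sketch, not the course's method: the data list is made up, the helper names (median, quartiles, sample_sd) are hypothetical, and the quartile rule assumed is the one in these notes (leave the median out of each half when the number of data points is odd).

```python
import math
from statistics import mode

def median(sorted_vals):
    """Middle value of an already sorted list (average of the 2 middle values if even)."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2 == 1:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3); the median is left out of each half when n is odd."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    upper = s[half + 1:] if n % 2 == 1 else s[half:]
    return median(lower), median(s), median(upper)

def sample_sd(values):
    """Sample standard deviation: divide the squared deviations by n - 1."""
    mean = sum(values) / len(values)
    squared = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared) / (len(values) - 1))

data = [2, 4, 4, 5, 7, 9, 11]      # made-up example data
q1, q2, q3 = quartiles(data)
mean = sum(data) / len(data)
most_common = mode(data)
data_range = max(data) - min(data)
iqr = q3 - q1
sd = sample_sd(data)
variance = sd ** 2

print(mean, q1, q2, q3, most_common, data_range, iqr, sd, variance)
```

Other tools (including Excel's quartile functions) may use a different quartile rule and give slightly different Q1 and Q3 values for the same data.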
S2 Uni/Bivariate Statistics
Excel
Box and whisker plot
Select numbers, Insert > recommended charts > box and whisker plots
Click on + to add axis names
Symbols
Type ' (apostrophe) before an entry to prevent date formatting
* = multiply
/ = divide
$ locks the column or row of a cell reference when a formula is copied
e.g. B$12 keeps row 12 fixed
Bivariate Stats
Independent variable
Causes dependent variable to change
X
Dependent variable
Y
Causation and Correlation
Correlation
Strength of linear relationships
Pearson's Correlation Coefficient
add photo from powerpoint
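As an illustration of Pearson's Correlation Coefficient, here is a minimal Python sketch using the standard textbook formula (sum of products of deviations, divided by the square root of the product of the sums of squared deviations). The function name pearson_r and the x/y data are made up for the example and do not come from the course slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: strength and direction of the linear relationship, from -1 to 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]        # made-up independent variable
y = [2, 4, 5, 4, 5]        # made-up dependent variable
r = pearson_r(x, y)
print(round(r, 4))
```

Values near 1 or -1 suggest a strong linear relationship; values near 0 suggest a weak one. A strong correlation on its own does not show causation.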
Univariate Stats
Distribution
Normal / Bell Shaped curve
Empirical Rule AKA 68/95/99.7
68% of data is within 1 Standard Deviation from mean
95% within 2 S.D. of mean
99.7% within 3 S.D. of mean
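The 68/95/99.7 figures can be checked against the normal curve itself. The sketch below uses Python's standard library NormalDist; the helper name within is hypothetical and the mean and S.D. defaults are arbitrary, since the shares are the same for any normal distribution.

```python
from statistics import NormalDist

def within(k, mean=0.0, sd=1.0):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

shares = {k: within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(k, round(share * 100, 1))
```

The rule only applies to data that is roughly bell shaped.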
Outliers
Lower fence = Q1 - 1.5 x IQR
Upper fence = Q3 + 1.5 x IQR
If outside this range, data is an outlier
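The fence rule can be sketched in Python as follows. The function names (fences, outliers) and the data are made up for illustration; Q1 and Q3 are passed in directly so that any quartile method can be used.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(values, q1, q3):
    """Values that fall outside the fences."""
    lower, upper = fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

data = [2, 4, 4, 5, 7, 9, 30]      # made-up data; Q1 = 4 and Q3 = 9
lower, upper = fences(4, 9)
flagged = outliers(data, 4, 9)
print(lower, upper, flagged)
```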
S3 Bivariate Statistics
Excel
Residual Analysis
Residual = real Y value - predicted Y value
Make and label columns 'y-predict' and 'residuals'
In 'y-predict', take the equation for Y from the scatter plot and substitute the x value, e.g. =28.679*A2-78.733 (where A2 is x) > copy and paste down
In 'residuals', enter =Y-Ypredict (e.g. =B2-C2) > copy and paste down
Highlight residual data > create scatter plot > if the residuals form a parabolic (curved) pattern, the relationship is non-linear
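The same residual columns can be sketched outside Excel. The equation below is the example trendline from these notes (28.679x - 78.733); the x and y values are made up purely for illustration, and the function name predict is hypothetical.

```python
def predict(x):
    """Predicted y from the example trendline equation in the notes."""
    return 28.679 * x - 78.733

xs = [3, 4, 5, 6]                  # made-up x values (column A)
ys = [8.0, 35.5, 65.0, 92.9]       # made-up real y values (column B)

y_predict = [predict(x) for x in xs]                      # column C
residuals = [y - yp for y, yp in zip(ys, y_predict)]      # column D: real - predicted

for x, res in zip(xs, residuals):
    print(x, round(res, 3))
```

Plotting residuals against xs then shows whether the points scatter randomly around zero or follow a curve.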
Line of Best Fit / R^2
In scatter graph, right click data point > open 'add trendline' > scroll to bottom > check 'Display Equation' and 'Display R-squared'
AKA Least Squares Regression Line
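To get R, copy the R^2 value into a new cell and enter "=SQRT(R^2)" in another > add this to a text box in the scatter plot
SQRT gives the positive root only; R is negative when the line slopes downward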
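For reference, the numbers Excel's trendline displays can be reproduced by hand. This is a minimal sketch of the usual least squares formulas, with made-up data and a hypothetical function name (least_squares); it also recovers R from R^2 by taking the square root and giving it the sign of the slope.

```python
import math

def least_squares(xs, ys):
    """Return (slope, intercept, r_squared) for the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

x = [1, 2, 3, 4, 5]        # made-up data
y = [2, 4, 5, 4, 5]
slope, intercept, r2 = least_squares(x, y)
r = math.copysign(math.sqrt(r2), slope)   # R takes the sign of the slope
print(round(slope, 3), round(intercept, 3), round(r2, 3), round(r, 4))
```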
Transformations
Makes R/R^2 stronger and removes parabolic shape in scatter plot
Stretch X/Y
Create new tabs called x^2, x^3, x^4 etc. > copy/paste x/y values in new tab > create new columns labelled 'x^2' (or relevant power) and 'y'
In 'x^2' input "=A2^2" and copy/paste down > copy/paste original y values
Repeat Line of Best Fit to determine strength of R/R^2
Repeat with next highest power, e.g. x^3, until R/R^2 stops increasing
Repeat Residual Analysis to check for parabolic shape
Compress X/Y
Complete both and use strongest relationship
Log10
Copy x values > label 'y log' column > input "=LOG(y value)" and copy down
Complete Line of Best Fit and Residual Analysis, adjusting y to log y in equations and graphs
Reciprocal
Copy x values > label 'reciprocal' column > input "=1/y value" and copy down
Complete Line of Best Fit and Residual Analysis, adjusting y to 1/y in equations and graphs
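The idea of trying several transformations and keeping the one with the strongest R^2 can be sketched as below. The data is made up so that y follows x^2 exactly, and the names (r_squared, candidates, scores, best) are hypothetical; real data will not give a perfect fit, and a residual plot should still be checked for the chosen transformation.

```python
import math

def r_squared(xs, ys):
    """R^2 of the straight line of best fit through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

x = [1, 2, 3, 4, 5, 6]             # made-up data
y = [1, 4, 9, 16, 25, 36]          # follows y = x^2

candidates = {
    "none": (x, y),
    "x^2": ([v ** 2 for v in x], y),                 # stretch x
    "log10 y": (x, [math.log10(v) for v in y]),      # compress y
    "1/y": (x, [1 / v for v in y]),                  # compress y
}
scores = {name: r_squared(a, b) for name, (a, b) in candidates.items()}
best = max(scores, key=scores.get)
print(best, {name: round(s, 3) for name, s in scores.items()})
```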
S4 Probability
Experimental
Probability = f / n
f = number of observed favourable outcomes
n = total number of experiments
found by performing 'experiments'
should eventually converge to theoretical prob.
Theoretical
Probability = f / n
f = number of favourable outcomes
n = total number of outcomes
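The difference between the two kinds of probability can be illustrated with a simulated die. This is a sketch only: the function names (theoretical, experimental), the number of experiments and the fixed random seed are assumptions chosen for the example, and every outcome is assumed to be equally likely.

```python
import random
from fractions import Fraction

def theoretical(favourable, outcomes):
    """f / n, where f = favourable outcomes and n = total outcomes."""
    return Fraction(len(favourable), len(outcomes))

def experimental(favourable, outcomes, n, seed=0):
    """f / n, where f = observed favourable results in n simulated experiments."""
    rng = random.Random(seed)
    f = sum(1 for _ in range(n) if rng.choice(outcomes) in favourable)
    return f / n

die = [1, 2, 3, 4, 5, 6]
even = {2, 4, 6}

p_theory = theoretical(even, die)
p_exp = experimental(even, die, n=10_000)
print(p_theory, p_exp)
```

Increasing n should bring the experimental value closer to the theoretical one.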
Sets
groups of data displayed with { }
e.g. a set of integers 1-5 = {1,2,3,4,5}
e.g. flipping a coin = {H,T}
ξ = universal set
Events
a set of outcomes
e.g. rolling an even number on a die = {2,4,6}
Cardinal number
describes total size of set, n(X)
e.g. A= {1,3,5,7}, n(A)=4
Complementary Event
AKA non-favourable outcomes. Cannot occur at the same time as X. Denoted by X' (not X)
P(X) + P(X') always equals 1, or 100% of the outcomes
e.g. ξ = {0, 1, 2, ..., 99, 100}. If A represents the even positive integers, what is A'? A' = the odd positive integers and 0
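The complement example above can be checked with Python sets. The variable names (universal, A, A_complement) are chosen for illustration; len() plays the role of the cardinal number n(X).

```python
universal = set(range(0, 101))                             # the universal set: 0 to 100
A = {v for v in universal if v > 0 and v % 2 == 0}         # even positive integers
A_complement = universal - A                               # A': everything not in A

print(len(A), len(A_complement), len(universal))
print(sorted(A_complement)[:5])
```

The two cardinal numbers add up to the size of the universal set, which matches the rule that an event and its complement cover 100% of the outcomes.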
LATTICE diagrams
Venn diagram
notation
∩ = intersect, and, in both