Hypothesis Testing and Linear Regression
Chap 5
descriptions
explanations for possible trends and recommendations
Done after "data understanding" and "data prep", once descriptive info has been gathered using "EDA".
Jargon
Population - collection of all elements in a particular study
Parameter - characteristic of a population - usually unknown - usually represented by Greek letters
sample - subset of population
Statistic - characteristic of sample
point estimate
- using a statistic to estimate its corresponding parameter
Statistical inference
consists of methods for estimating and testing hypotheses about population characteristics based on the information contained in the sample
How confident are we in our estimates
sampling errors
will always happen - difference between the parameter and the statistic - always take the absolute value
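A minimal sketch of a point estimate and its sampling error, using only the Python standard library. The population is simulated, so the parameter is known here, which is normally not the case. All numbers (mean 170, sd 8, sample size 50) are made up for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population (simulated, so its parameter can be computed)
population = [rng.gauss(170, 8) for _ in range(10_000)]
parameter = statistics.mean(population)      # population mean (the parameter)

# Sample = subset of the population; its mean is the statistic
sample = rng.sample(population, 50)
statistic = statistics.mean(sample)          # point estimate of the parameter

# Sampling error = absolute difference between parameter and statistic
sampling_error = abs(statistic - parameter)
print(f"parameter={parameter:.2f} statistic={statistic:.2f} error={sampling_error:.2f}")
```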
confidence interval
point estimate ± margin of error
margin of error = t(alpha/2) * (sd / sqrt(n))
t ≈ 1.96 for 95% confidence (large samples) - a smaller t gives a narrower interval (more precision) but a lower confidence lvl
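The interval above can be sketched in Python as follows. This is an illustration under one stated assumption: it uses the normal (large-sample) critical value, which is about 1.96 at 95%, instead of an exact t value, because the standard library has no t distribution. The function name and the data are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (point_estimate, margin_of_error) for the population mean."""
    n = len(sample)
    point_estimate = statistics.mean(sample)   # statistic used as the estimate
    sd = statistics.stdev(sample)              # sample standard deviation
    alpha = 1 - confidence
    # Assumption: normal critical value in place of t(alpha/2); fine for large n
    critical = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    margin = critical * sd / math.sqrt(n)      # critical value * standard error
    return point_estimate, margin

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]  # made-up data
estimate, margin = mean_confidence_interval(sample)
print(f"{estimate:.3f} +/- {margin:.3f}")
```

For a sample this small an exact t value would give a somewhat wider interval, so treat the output as a rough guide only.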
reducing margin of error
affected by t - depends on the confidence lvl and sample size
sd - cannot be changed for a given population
n - sample size
so basically increase the sample size or take a sample with less variability (or reduce the conf lvl)
sd / sqrt(n) = standard error - small when the sample is large or the variability is small
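A short sketch of how the margin of error reacts to sample size and confidence level. The sd of 10 and the sample sizes are hypothetical, and 1.96 and 1.645 are the usual large-sample critical values for 95% and 90%.

```python
import math

def margin_of_error(sd, n, critical=1.96):
    """Margin of error = critical value * standard error (sd / sqrt(n))."""
    return critical * sd / math.sqrt(n)

sd = 10.0  # hypothetical standard deviation

# Larger samples shrink the standard error and hence the margin
for n in (25, 100, 400):
    print(f"n={n:4d}  margin={margin_of_error(sd, n):.2f}")

# Lower confidence (90% instead of 95%) also shrinks the margin
print(f"90% at n=100: {margin_of_error(sd, 100, critical=1.645):.2f}")
```

Because n sits under a square root, the sample size has to be quadrupled to halve the margin.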
Hypothesis testing
refer to stats notes
Confidence intervals for proportions
refer to stats notes
Other stats methods can be used here too
such as hypothesis testing for proportions
using conf intervals for hypothesis testing
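As a rough illustration of the two ideas above, here is a one-sample z test for a proportion next to a confidence interval used for the same decision. The counts (60 successes out of 100) and the null value 0.5 are made up, and both functions are hypothetical helpers based on the normal approximation, not taken from the notes.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Two-sided one-sample z test for a proportion; returns (z, p_value)."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)        # standard error under the null
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation interval for a proportion; returns (low, high)."""
    p_hat = successes / n
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = critical * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

z, p_value = proportion_z_test(60, 100, p0=0.5)
low, high = proportion_confidence_interval(60, 100)
print(f"z={z:.2f} p={p_value:.4f} CI=({low:.3f}, {high:.3f})")

# Using the interval as a test: reject H0 if p0 lies outside it
reject_by_interval = not (low <= 0.5 <= high)
print("reject H0:", reject_by_interval)
```

The test uses the standard error under the null while the interval uses the one from the sample, so the two approaches can disagree in borderline cases.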
Types of errors
Type 1 - rejecting a true null hypothesis (not guilty but still convicted)
Type 2 - failing to reject a false null hypothesis (guilty but goes free)
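A small simulation sketch of both error types for a two-sided test of a mean at the 5% level. Everything here is an assumption made for illustration: the null mean of 50, the sd of 10, the sample size of 100, the alternative mean of 52, and the use of the large-sample critical value 1.96.

```python
import math
import random
import statistics

def rejects_null(sample, mu0, critical=1.96):
    """Two-sided large-sample test of H0: population mean == mu0."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    z = (statistics.mean(sample) - mu0) / se
    return abs(z) > critical

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 2000

# Type 1: H0 is true (mean really is 50) but the test rejects it
false_alarms = sum(
    rejects_null([rng.gauss(50, 10) for _ in range(100)], mu0=50)
    for _ in range(trials)
)
type1_rate = false_alarms / trials

# Type 2: H0 is false (mean is really 52) but the test fails to reject it
misses = sum(
    not rejects_null([rng.gauss(52, 10) for _ in range(100)], mu0=50)
    for _ in range(trials)
)
type2_rate = misses / trials

print(f"type 1 rate ~ {type1_rate:.3f}, type 2 rate ~ {type2_rate:.3f}")
```

The Type 1 rate should land close to the chosen 5% level, while the Type 2 rate depends on how far the true mean is from the null value and on the sample size.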
When dividing into train and test sets, check that both are comparable in
Sample means
SDs
Sizes
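A minimal sketch of that check on a random 80/20 split, using only the standard library. The data are simulated and the 80/20 ratio is an assumption, not something stated in the notes.

```python
import random
import statistics

def summarize(values):
    """Size, mean and standard deviation of one partition."""
    return {
        "size": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }

rng = random.Random(42)  # fixed seed so the split is repeatable
data = [rng.gauss(100, 15) for _ in range(1000)]  # made-up values of one variable

rng.shuffle(data)                 # shuffle before splitting to avoid ordering bias
cut = int(0.8 * len(data))        # assumed 80/20 split
train, test = data[:cut], data[cut:]

for name, part in (("train", train), ("test", test)):
    s = summarize(part)
    print(f"{name}: size={s['size']} mean={s['mean']:.2f} sd={s['sd']:.2f}")
```

If the means or sds differ a lot between the two partitions, the split may not be representative and is worth redoing.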