Statistical Learning - Coggle Diagram
Statistical Learning
What
Y = f(X) + ε
Supervised
Predict a response (dependent measurement)
Classification
Regression
Unsupervised
Clustering
Semi-Supervised
Small amount of labeled data with a large amount of unlabeled data
Why
Prediction
Ŷ = f̂(X)
f̂ can be treated as a black box :check:
Inference
What is the relationship between the response and each predictor?
Which predictors are associated with the response?
Can the relationship be adequately summarized using a linear equation?
Both
Errors
Irreducible
Includes useful but unmeasured variables
Lowest possible error rate
Bayes Error
Reducible
By improving the accuracy of f̂
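The split between the two error types can be written out as a short derivation. This is a sketch that assumes f̂ and X are held fixed and that the error term (e in Y = f(X)+e, written ε here) has mean zero:

```latex
\begin{aligned}
E\big(Y - \hat{Y}\big)^2 &= E\big[f(X) + \varepsilon - \hat{f}(X)\big]^2 \\
&= \underbrace{\big[f(X) - \hat{f}(X)\big]^2}_{\text{reducible}}
 + \underbrace{\operatorname{Var}(\varepsilon)}_{\text{irreducible}}
\end{aligned}
```

The first term shrinks as the estimate of f improves; the second does not depend on the estimate at all, so it sets a floor on the expected squared error.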
How
Parametric
Estimating f reduces to estimating a set of parameters
Chosen model NOT matching true f :red_flag:
Overfitting :red_flag:
Non-Parametric
Seeks an estimate of f that gets as close to the data points as possible without being too rough or wiggly
Needs LOTS of observations :red_flag:
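The contrast between the two approaches can be sketched in a few lines of standard-library Python. Everything here is an illustrative assumption, not part of the diagram: the true function `true_f` (a quadratic), the noise level, the sample size, and the choice of k-nearest-neighbour averaging as the non-parametric method. The parametric fit assumes a straight line, so it shows the "chosen model NOT matching true f" risk.

```python
import random

def fit_linear(xs, ys):
    """Parametric: assume f(x) = b0 + b1*x and estimate b0, b1 by least squares."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return lambda x: b0 + b1 * x

def fit_knn(xs, ys, k=5):
    """Non-parametric: no assumed form; average the k nearest training responses."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def true_f(x):
    # Hypothetical non-linear truth, chosen only for illustration.
    return x * x

def mse_against_truth(model, xs):
    """Mean squared distance between the fitted model and the true f on a grid."""
    return sum((model(x) - true_f(x)) ** 2 for x in xs) / len(xs)

rng = random.Random(0)
train_x = [rng.uniform(-2, 2) for _ in range(200)]
train_y = [true_f(x) + rng.gauss(0, 0.3) for x in train_x]
grid = [i / 10 for i in range(-15, 16)]

linear_mse = mse_against_truth(fit_linear(train_x, train_y), grid)
knn_mse = mse_against_truth(fit_knn(train_x, train_y, k=5), grid)
print(f"linear fit, MSE vs true f: {linear_mse:.3f}")
print(f"kNN fit,    MSE vs true f: {knn_mse:.3f}")
```

With a truth that really is linear, the ranking would likely flip in favour of the parametric fit, which needs far fewer observations to estimate two parameters.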
Trade-offs
Flexibility vs. Interpretability
Bias-Variance
High Variance
=> Small changes in the data -> large changes in the result =>
Overfitting
Add noise
Feature selection
Increase training set
L2 (ridge) or L1 (lasso) regularization; L1 can shrink weights exactly to zero, L2 only shrinks them toward zero
Regularization = adding a penalty term, scaled by a tuning parameter, to a model's objective to induce smoothness and prevent overfitting
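Why L1 drops weights while L2 does not is easiest to see in a simplified special case. The sketch below assumes one coefficient per observation with an identity design matrix, so each penalized estimate has a closed form in terms of the unpenalized estimate `b`. The function names, the sample coefficients, and the value of `lam` are hypothetical.

```python
def ridge_shrink(b, lam):
    """Minimizer of (b - beta)^2 + lam * beta^2: proportional shrinkage."""
    return b / (1.0 + lam)

def lasso_shrink(b, lam):
    """Minimizer of (b - beta)^2 + lam * |beta|: soft thresholding at lam / 2."""
    t = lam / 2.0
    if b > t:
        return b - t
    if b < -t:
        return b + t
    return 0.0

# Hypothetical unpenalized estimates, two large and two small.
unpenalized = [3.0, -1.5, 0.4, -0.2]
lam = 1.0  # tuning parameter controlling the penalty strength

ridge = [ridge_shrink(b, lam) for b in unpenalized]
lasso = [lasso_shrink(b, lam) for b in unpenalized]

print("ridge:", ridge)  # every weight shrinks, none reaches zero
print("lasso:", lasso)  # small weights are set exactly to zero
```

In this simplified setting ridge scales every weight by the same factor, while lasso subtracts a fixed amount and clips at zero, which is what makes it act as a feature selector.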
Use cross-validation
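A minimal sketch of k-fold cross-validation used to pick a flexibility level, assuming a k-nearest-neighbour regressor and simulated data. The data-generating function, the noise level, the candidate neighbour counts, and the helper names are all hypothetical.

```python
import random

def knn_predict(train, x, k):
    """Average the responses of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def k_fold_mse(data, k_neighbors, n_folds=5):
    """Average held-out mean squared error over n_folds folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    fold_errors = []
    for i, held_out in enumerate(folds):
        # Train on every fold except the one being held out.
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        sq_errors = [(knn_predict(train, x, k_neighbors) - y) ** 2
                     for x, y in held_out]
        fold_errors.append(sum(sq_errors) / len(sq_errors))
    return sum(fold_errors) / n_folds

# Simulated data: hypothetical quadratic truth plus Gaussian noise.
rng = random.Random(1)
data = []
for _ in range(150):
    x = rng.uniform(-2, 2)
    data.append((x, x * x + rng.gauss(0, 0.3)))

cv_scores = {k: k_fold_mse(data, k) for k in (1, 5, 25, 75)}
best_k = min(cv_scores, key=cv_scores.get)

for k, score in cv_scores.items():
    print(f"k={k:>2}  CV MSE={score:.3f}")
print("selected k:", best_k)
```

Because each score is computed on observations the model did not see during fitting, a very flexible setting cannot win merely by memorizing the training data.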
Bagging (reduces variance); boosting (mainly reduces bias)
Dropout technique
High Bias => Underfitting
More epochs of training
Add Features
Variance = amount by which f̂ would change if we estimated it using a different training data set
Goal: LOW bias, LOW variance
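The trade-off can be made concrete by refitting a model on many simulated training sets and looking at how its prediction at one point moves around. This is a sketch under stated assumptions: the true function, the noise level, the query point `x0`, and the use of k-nearest-neighbour averaging (small k as the flexible model, large k as the rigid one) are all hypothetical choices.

```python
import random
from statistics import mean, pvariance

def true_f(x):
    # Hypothetical truth, chosen only for illustration.
    return x * x

def knn_predict(train, x, k):
    """Average the responses of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def bias_sq_and_variance(k, x0=0.0, n_train=50, n_repeats=300, seed=0):
    """Refit on fresh training sets; return (squared bias, variance) at x0."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_repeats):
        train = []
        for _ in range(n_train):
            x = rng.uniform(-2, 2)
            train.append((x, true_f(x) + rng.gauss(0, 0.5)))
        predictions.append(knn_predict(train, x0, k))
    bias_sq = (mean(predictions) - true_f(x0)) ** 2
    return bias_sq, pvariance(predictions)

flexible = bias_sq_and_variance(k=1)   # follows the data closely
rigid = bias_sq_and_variance(k=40)     # averages over most of the data

print(f"k=1   bias^2={flexible[0]:.3f}  variance={flexible[1]:.3f}")
print(f"k=40  bias^2={rigid[0]:.3f}  variance={rigid[1]:.3f}")
```

The flexible setting should show the larger variance and the rigid setting the larger squared bias; the goal is a setting in between where their sum is smallest.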