Chapter 5: Overfitting and its Avoidance
Generalization
Table Model
Memorizes training data
generalizing
model applies to data that were not used to build the model
Not representative of the population
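A minimal sketch of the table-model idea above, assuming a hypothetical toy data set (the feature tuples and the "stay"/"churn" labels are invented for illustration). The model memorizes every training row, so it is perfect on the training data but can only guess the majority label for rows it has never seen.

```python
from collections import Counter


class TableModel:
    """Memorizes each training row; unseen rows get the majority training label."""

    def fit(self, rows, labels):
        self.table = {tuple(r): y for r, y in zip(rows, labels)}
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, row):
        return self.table.get(tuple(row), self.default)


def accuracy(model, rows, labels):
    hits = sum(model.predict(r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


# Hypothetical toy data: (age, balance bucket) -> label
train_rows = [(25, 1), (40, 3), (31, 2), (58, 1)]
train_labels = ["stay", "stay", "churn", "stay"]
test_rows = [(26, 1), (33, 2), (31, 2)]
test_labels = ["stay", "churn", "churn"]

model = TableModel().fit(train_rows, train_labels)
train_acc = accuracy(model, train_rows, train_labels)  # memorized rows: all correct
test_acc = accuracy(model, test_rows, test_labels)     # unseen rows fall back to the default
print(train_acc, test_acc)
```

The gap between the two numbers is the point: training accuracy alone says nothing about how the model generalizes.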
Overfitting
tailoring models to the training data
at the expense of generalization
hold out data
withholding data from model
Generalization Performance
Estimated by comparing predicted values with true values
Accuracy depends on complexity
Why is it bad?
Performance degrades
increasing complexity picks up harmful correlations
Incorrect generalizations
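The relationship between complexity, training accuracy, and holdout accuracy can be sketched as follows. This is an illustration under stated assumptions, not a method from the chapter: the data are synthetic (one feature, 20% label noise), and "complexity" is the number of bins of a hypothetical piecewise-constant classifier.

```python
import random


def make_data(n, rng, noise=0.2):
    """Synthetic data (assumption): true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def bin_of(x, n_bins):
    return min(int(x * n_bins), n_bins - 1)


def fit_bins(train, n_bins):
    """Majority label per bin; more bins means a more complex model."""
    counts = [[0, 0] for _ in range(n_bins)]
    for x, y in train:
        counts[bin_of(x, n_bins)][y] += 1
    overall = int(2 * sum(y for _, y in train) >= len(train))
    # Empty bins fall back to the overall majority label.
    return [overall if c0 + c1 == 0 else int(c1 > c0) for c0, c1 in counts]


def accuracy(model, data):
    hits = sum(model[bin_of(x, len(model))] == y for x, y in data)
    return hits / len(data)


rng = random.Random(0)
data = make_data(400, rng)
train, holdout = data[:200], data[200:]  # the holdout half is withheld from model building

results = []
for n_bins in (1, 2, 4, 8, 16, 32, 64, 128):
    model = fit_bins(train, n_bins)
    results.append((n_bins, accuracy(model, train), accuracy(model, holdout)))

for n_bins, train_acc, holdout_acc in results:
    print(f"bins={n_bins:3d}  train={train_acc:.3f}  holdout={holdout_acc:.3f}")
```

Because the bin counts are nested powers of two, training accuracy can only stay level or rise as bins are added. Holdout accuracy has no such guarantee, and that is where overfitting shows up.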
From holdout evaluation to cross-validation
cross validation
more sophisticated holdout training and testing procedure
gives not just a single estimate of generalization performance but also its mean and variance across folds
Better use of limited data set
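A minimal k-fold cross-validation sketch in plain Python. The helper names (`k_fold_splits`, `cross_validate`, `fit_stump`) and the synthetic data are assumptions made for illustration. Each example is used for testing exactly once and for training in the other folds, which is how cross-validation makes better use of a limited data set.

```python
import random
from statistics import mean, stdev


def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds over n examples."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, folds[i]


def fit_stump(xs, ys):
    """Pick the threshold t (from training values) that best predicts label 1 when x > t."""
    best_t, best_hits = None, -1
    for t in sorted(set(xs)):
        hits = sum((x > t) == bool(y) for x, y in zip(xs, ys))
        if hits > best_hits:
            best_t, best_hits = t, hits
    return best_t


def predict_stump(t, x):
    return int(x > t)


def cross_validate(xs, ys, fit, predict, k=5):
    """Return one holdout accuracy per fold."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        hits = sum(predict(model, xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return scores


# Synthetic data (assumption): label is 1 when x > 0.5, with 20% label noise.
rng = random.Random(42)
xs = [rng.random() for _ in range(100)]
ys = [int(x > 0.5) if rng.random() >= 0.2 else 1 - int(x > 0.5) for x in xs]

scores = cross_validate(xs, ys, fit_stump, predict_stump, k=5)
print(f"mean accuracy={mean(scores):.3f}  std dev={stdev(scores):.3f}")
```

Reporting the spread across folds alongside the mean gives a sense of how much the estimate depends on the particular split.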
Learning Curves
plot of the generalization performance against the amount of training data
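A learning curve can be sketched by training on growing slices of the data and scoring each model on the same fixed holdout set. The nearest-centroid classifier and the synthetic data below are assumptions chosen to keep the example short, not something the chapter prescribes.

```python
import random


def make_data(n, rng, noise=0.2):
    """Synthetic data (assumption): true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def fit_centroids(train):
    """Nearest-centroid classifier: store the mean x of each class seen in training."""
    sums = {}
    for x, y in train:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}


def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


rng = random.Random(1)
pool = make_data(400, rng)     # training pool, used in growing slices
holdout = make_data(500, rng)  # fixed holdout set, never used for training

curve = []
for size in (10, 25, 50, 100, 200, 400):
    model = fit_centroids(pool[:size])
    hits = sum(predict(model, x) == y for x, y in holdout)
    curve.append((size, hits / len(holdout)))

for size, acc in curve:
    print(f"training size={size:3d}  holdout accuracy={acc:.3f}")
```

Plotting `curve` with training size on the x-axis and holdout accuracy on the y-axis gives the learning curve. Unlike the complexity comparison, the model family stays fixed and only the amount of training data changes.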
Avoidance and Complexity control
Limit size
prune
removal or replacement
generalizing too much
General method
overfitting
test data must stay strictly independent of model building
balance of complexity and accuracy
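One way to keep the test data independent while still tuning complexity is a nested holdout: split the training data again, choose the complexity on the inner validation split, then score once on the untouched test set. The sketch below assumes synthetic data and a hypothetical 1-D k-nearest-neighbor classifier, where a smaller `k` means a more complex model.

```python
import random


def make_data(n, rng, noise=0.2):
    """Synthetic data (assumption): true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k is odd, so no ties)."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in neighbors)
    return int(2 * votes > k)


def accuracy(train, data, k):
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)


rng = random.Random(2)
data = make_data(500, rng)
train, test = data[:350], data[350:]              # test set stays untouched until the end
sub_train, validation = train[:250], train[250:]  # nested split inside the training data

# Pick the complexity that does best on the validation split, never on the test set.
candidates = (1, 3, 5, 9, 15, 25, 41)
best_k = max(candidates, key=lambda k: accuracy(sub_train, validation, k))

# Rebuild on all training data with the chosen complexity, then evaluate once.
test_acc = accuracy(train, test, best_k)
print(f"chosen k={best_k}  test accuracy={test_acc:.3f}")
```

The test set plays no part in choosing `best_k`, so `test_acc` remains an independent estimate of generalization performance.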