Chapter 5
overfitting
Chance occurrences that look like patterns
Table model
Memorizes the training data; no generalization
In practice, defaults to predicting no churn for unseen cases
Generalization
model applies to data not used in model creation
Tailoring models too closely to the training data
If you look hard enough, you will find patterns, even chance ones
Overfitting examined
fitting graph
plots model accuracy as a function of model complexity
Shows accuracy on both training and holdout data as complexity changes
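The fitting graph can be reproduced numerically with the chapter's table model. A minimal sketch with hypothetical data: labels are pure noise, "complexity" is how many training rows the table memorizes, so training accuracy climbs toward 1 while holdout accuracy stays near the base rate.

```python
import random

rng = random.Random(0)
# Hypothetical data: (unique_customer_id, churn_label); labels are
# random noise, so there is no real pattern to generalize from.
data = [(i, rng.randint(0, 1)) for i in range(200)]
train, holdout = data[:100], data[100:]

def table_model_accuracy(k):
    """Table model of 'complexity' k: memorize the first k training
    rows; predict the majority class (0 = no churn) for everything else."""
    table = dict(train[:k])
    def predict(x):
        return table.get(x, 0)
    train_acc = sum(predict(x) == y for x, y in train) / len(train)
    holdout_acc = sum(predict(x) == y for x, y in holdout) / len(holdout)
    return train_acc, holdout_acc

for k in (0, 50, 100):
    print(k, table_model_accuracy(k))
```

As k grows, the gap between the two curves is exactly the overfitting the fitting graph is meant to expose.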
Base rate
Holdout data
Target variable values are known, but the data is not used in model creation
Acts as lab test
Also known as the test set
Holdout Evaluation
Holdout data gives an estimate of generalization performance
but it is a single estimate from just one split
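A minimal sketch of the holdout idea (function name and split fraction are hypothetical): shuffle, set a fraction aside, and never show those rows to the learner.

```python
import random

def holdout_split(rows, holdout_frac=0.3, seed=0):
    """Shuffle and set aside a fraction of the data. The held-out rows'
    target values are known, but the learner never sees them, so they
    act as the 'lab test' for the finished model."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_frac))
    return shuffled[:cut], shuffled[cut:]

train, holdout = holdout_split(list(range(10)))
```

Because the result depends on which rows happen to land in the holdout set, the accuracy it yields is only a single estimate, which motivates cross-validation below.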
Cross validation
multiple splits and sample swapping for testing
splits data into folds
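The fold mechanics can be sketched in a few lines (helper name is hypothetical): each of the k folds serves once as the test set while the remaining folds train the model, giving k accuracy estimates instead of one.

```python
def k_fold_splits(n, k):
    """Split n example indices into k folds and yield
    (train_indices, test_indices) pairs, one per fold."""
    folds = [list(range(n))[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Training indices are everything outside the current test fold.
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i
                     for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_splits(10, k=5))
```

Averaging the per-fold scores gives a more stable generalization estimate than a single holdout split.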
Avoiding overfitting
tree induction
Problem: the tree keeps growing until leaves are pure
Solution
limit tree size
require a minimum number of instances per leaf
Grow the tree large, then prune it
If replacing a branch with a leaf does not reduce accuracy, prune it
General
nothing is special about the first training/holdout split
split the training data again into training and testing portions
Testing
Validation set
Training
Subtraining set
Nested holdout testing
Sequential forward selection
nested holdout procedure
picks the best single feature
then the best pair, and so on
stops when adding a feature no longer improves accuracy
Sequential backward selection
works like SFS but in reverse: start with all features and remove one at a time
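A minimal SFS sketch, assuming a `score` callback that evaluates a feature subset on validation data (the toy scorer below is hypothetical: only "a" and "b" are informative, and each extra feature pays a small complexity penalty); SBS would be the same loop starting from the full set and removing features.

```python
def sequential_forward_selection(features, score):
    """Greedily add the feature that most improves the validation
    score; stop when no remaining feature helps."""
    selected, best = [], score([])
    remaining = list(features)
    while remaining:
        top_score, top_feat = max((score(selected + [f]), f)
                                  for f in remaining)
        if top_score <= best:  # no feature improves accuracy: stop
            break
        selected.append(top_feat)
        remaining.remove(top_feat)
        best = top_score
    return selected

def toy_score(feats):
    # Hypothetical scorer: reward informative features, penalize size.
    return len(set(feats) & {"a", "b"}) - 0.01 * len(feats)

chosen = sequential_forward_selection(["a", "b", "c", "d"], toy_score)
```

Because each candidate subset is scored on held-out data inside the loop, the selection itself is part of the nested holdout procedure rather than a use of the final test set.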
Overfitting in tree induction
Splitting until each node becomes an individual leaf per instance
Same as the table model example
Leads to overfitting
A less complex tree could generalize better
Why is overfitting bad?
more complexity = more spurious correlations picked up
these may be unique to the training data, not the population
Every model can be overfitted
training data is a portion of population
No way to tell from training performance alone whether a model is overfit
use holdout data
Modeling Laboratory
building a laboratory is costly and time-consuming
but aspects of the real system can be evaluated more quickly in the lab
some models don't work in real as in lab
training and deployment populations are different