Chapter 5: Overfitting and Its Avoidance
Overfitting
tendency of
data mining procedures
to tailor models
to training data
all models do this
to some extent
memorization
most extreme overfitting
trade-off
between
model complexity
and possible overfitting
no procedure
can completely eliminate
overfitting
must understand
how to recognize
overfitting
before trying to
eliminate it
Holdouts and Fitting
Fitting Graph
shows
model accuracy
as a function
of complexity
Training Data
cannot
assess model accuracy
for unseen cases
Generalization Performance
compares
predicted values
to hidden true values
Holdout Data
data that is
hidden from model
to test model accuracy
often called
test set
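The holdout idea above can be sketched with scikit-learn, tracing a fitting graph of accuracy versus complexity; the synthetic dataset, the 30% split, and the chosen depths are illustrative assumptions, not from the text:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic example data (sizes and parameters are illustrative).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Hold out 30% of the data: the model never sees it during training,
# so its score on this "test set" estimates generalization performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Trace the fitting graph: accuracy as a function of complexity (tree depth).
fitting_graph = []
for depth in (1, 3, 5, 10, None):  # None = grow until leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    fitting_graph.append(
        (depth, tree.score(X_train, y_train), tree.score(X_test, y_test)))

for depth, train_acc, test_acc in fitting_graph:
    print(depth, round(train_acc, 3), round(test_acc, 3))
```

Training accuracy keeps rising with depth while holdout accuracy eventually falls away from it; that gap is the overfitting the fitting graph makes visible.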
Models
Tree Induction
can be overfit
by splitting data
too many times
leaving leaves that are pure
but based on very few cases
still may generalize
because the model will
arrive at some classification
for every new case
complexity lies
in number of nodes
"sweet spot"
point of complexity where
holdout accuracy peaks
just before overfitting begins
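A small sketch of the memorization extreme for tree induction, again with scikit-learn on made-up data (dataset and seed are assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Illustrative data; any dataset with continuous features behaves similarly.
X, y = make_classification(n_samples=200, n_features=5, random_state=1)

# With no limits, tree induction keeps splitting until every leaf is pure,
# so the tree memorizes its training data.
tree = DecisionTreeClassifier(random_state=1).fit(X, y)

# Complexity lies in the number of nodes.
print("nodes:", tree.tree_.node_count)
print("training accuracy:", tree.score(X, y))
```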
Mathematical Functions
complexity can be controlled
by
adding more variables
More complex
adding more attributes
More complex
often have to prune attributes
to reduce overfitting
Increasing dimensionality
More complex
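For mathematical functions, the effect of adding variables can be sketched with polynomial regression; the quadratic toy data and the degrees tried are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical 1-D regression data: a quadratic trend plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(30, 1))
y = x[:, 0] ** 2 + rng.normal(0, 0.1, size=30)

# Adding variables (here, higher-degree polynomial terms) makes the function
# more complex: the fit to *training* data only improves, even past the
# true degree, because the extra terms start fitting the noise.
train_r2 = {}
for degree in (1, 2, 10):
    Xp = PolynomialFeatures(degree).fit_transform(x)
    train_r2[degree] = LinearRegression().fit(Xp, y).score(Xp, y)
    print(degree, round(train_r2[degree], 3))
```

This is why attributes often have to be pruned: training fit alone will always reward the more complex function.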
Avoiding Overfitting
Tree Induction
problem is
model keeps growing
until leaves
are pure
two strategies
stop growing tree
before it's too complex
grow tree too large
then prune back unnecessary branches
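Both strategies exist as parameters in scikit-learn's tree learner; this sketch compares them against an unrestricted tree (the dataset and the specific parameter values are assumptions for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Illustrative data and parameter values.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Baseline: a tree grown until all leaves are pure.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Strategy 1: stop growing before the tree is too complex
# (here, refuse splits that would leave fewer than 20 cases in a leaf).
stopped = DecisionTreeClassifier(min_samples_leaf=20, random_state=0).fit(X, y)

# Strategy 2: grow the tree too large, then trim it back
# (cost-complexity post-pruning via ccp_alpha).
pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X, y)

print(full.tree_.node_count, stopped.tree_.node_count,
      pruned.tree_.node_count)
```

Either way, the resulting tree has fewer nodes, i.e. less complexity, than the fully grown one.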
Generally
Nested Training Set
can estimate
generalization performance
by further splitting the training set
validation set
to test candidate models
sub-training set
to build candidate models
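A minimal sketch of the nested split, assuming scikit-learn and synthetic data (split fractions and candidate depths are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# First split: hold out a final test set, untouched until the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Second (nested) split: carve a validation set out of the training set.
X_sub, X_val, y_sub, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=0)

# Build candidates on the sub-training set; pick complexity on the
# validation set; only then score once on the held-out test set.
best = max((1, 2, 3, 5, 8, None),
           key=lambda d: DecisionTreeClassifier(max_depth=d, random_state=0)
                         .fit(X_sub, y_sub).score(X_val, y_val))
final = DecisionTreeClassifier(max_depth=best, random_state=0)
final.fit(X_train, y_train)
print(best, final.score(X_test, y_test))
```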
Cross Validation
analyzes
performance across
multiple splits of the same data
estimates
mean
variance
critical for assessing
confidence in performance
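Cross-validation is one call in scikit-learn; this sketch reports the mean and the spread across folds (dataset, depth, and fold count are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# 10-fold cross-validation: train and test on 10 different splits,
# giving 10 independent-looking estimates of accuracy.
scores = cross_val_score(
    DecisionTreeClassifier(max_depth=4, random_state=0), X, y, cv=10)

# The mean estimates performance; the standard deviation tells us
# how much confidence to place in that estimate.
print("mean:", round(scores.mean(), 3))
print("std: ", round(scores.std(), 3))
```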