Ch. 5 Overfitting
overfitting- the tendency of data mining procedures to tailor models to the training data, at the expense of generalizing to data not yet seen
fitting graph- shows the accuracy of a model as a function of model complexity
holdout data- data for which the value of the target variable is known but which is not used to build the model
test set- holdout data used to estimate the difference between the model's general accuracy and its accuracy on the training set
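a minimal sketch of a holdout evaluation, assuming scikit-learn; the make_classification data and the logistic regression model are stand-ins chosen only for illustration

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset (assumption for illustration).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Holdout data: target values are known, but these rows take no part in model building.
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("training accuracy:", round(model.score(X_train, y_train), 3))
print("holdout accuracy: ", round(model.score(X_hold, y_hold), 3))
```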
memorization- most extreme overfitting possible
overfitting in tree induction- keep splitting the data so the subsets eventually become pure
training accuracy > holdout accuracy
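a sketch of that effect, assuming scikit-learn decision trees: sweeping max_depth produces the numbers for a fitting graph, and the unrestricted tree (max_depth=None) grows pure leaves and memorizes the training set

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.3, random_state=0)

# Deeper trees = more complexity; None lets the tree grow until its leaves are pure.
for depth in (1, 2, 4, 8, None):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train={tree.score(X_train, y_train):.3f}, "
          f"holdout={tree.score(X_hold, y_hold):.3f}")
```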
overfitting mathematically- adding more terms to the function (e.g., an x² term that turns a line into a parabola) lets it bend to fit the training points more and more closely
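a toy sketch of the idea with NumPy polynomial fits; the sine-plus-noise data is an assumption made only for illustration. Training error keeps dropping as terms are added, while error on held-out points eventually rises

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: a simple curve plus noise, split into training and holdout points.
x_train = np.linspace(0.0, 1.0, 8)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=x_train.size)
x_hold = np.linspace(0.05, 0.95, 8)
y_hold = np.sin(2 * np.pi * x_hold) + rng.normal(scale=0.2, size=x_hold.size)

for degree in (1, 2, 4, 7):          # line, parabola, and beyond
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    hold_mse = np.mean((np.polyval(coeffs, x_hold) - y_hold) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, holdout MSE {hold_mse:.3f}")
```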
why overfitting is bad- the model memorizes the training data and is incapable of generalizing; past a certain complexity, growing the model further no longer improves it
as a model gets more complex, it is allowed to pick up more harmful spurious correlations
these spurious correlations provide incorrect generalizations
cross-validation- gives an estimate of generalization performance plus statistics on that estimate, i.e., mean and variance
critical in assessing confidence in the performance estimate
makes better use of a limited data set- computes its estimates over all the data by performing multiple splits and systematically swapping out samples for testing
folds- the k partitions into which the data set is split
k= 5 or 10
compare fold accuracies between logistic regression and classification trees
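a sketch of such a comparison, assuming scikit-learn's cross_val_score; the make_classification data stands in for a real dataset

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "classification tree": DecisionTreeClassifier(random_state=0),
}
for name, model in models.items():
    # k = 10 folds: each fold is held out once while the other folds train the model.
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: fold accuracies {np.round(scores, 3)}")
    print(f"  mean {scores.mean():.3f}, std dev {scores.std():.3f}")
```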
learning curve- plot of generalization performance against the amount of training data
shape- starts off steep, then flattens out
learning curve shows generalization performance (performance on testing data only) plotted against the amount of training data used
fitting graph shows generalization performance as well as the performance on the training data, but plotted against model complexity
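a sketch of computing a learning curve with scikit-learn's learning_curve helper; the tree model and synthetic data are assumptions for illustration

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Cross-validated (generalization) accuracy at increasing training-set sizes.
sizes, _, test_scores = learning_curve(
    DecisionTreeClassifier(random_state=0), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5,
)
for n, scores in zip(sizes, test_scores):
    print(f"{n:5d} training examples -> holdout accuracy {scores.mean():.3f}")
```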
more flexibility allows more overfitting
avoid overfitting by controlling the complexity of the models
2 techniques to avoid overfitting in tree induction:
1. stop growing the tree before it gets too complex
2. grow the tree until it is too large, then prune it back, reducing its size (and complexity)
the simplest method to limit tree size is to specify a minimum number of instances that must be present in a leaf
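a sketch of that simplest stopping rule via scikit-learn's min_samples_leaf parameter (the synthetic data is an assumption); larger minimums give smaller trees and usually a smaller gap between training and holdout accuracy

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_hold, y_train, y_hold = train_test_split(X, y, random_state=0)

for min_leaf in (1, 5, 25, 100):
    # Pre-pruning: a split is only made if each resulting leaf keeps
    # at least `min_leaf` training instances.
    tree = DecisionTreeClassifier(min_samples_leaf=min_leaf, random_state=0)
    tree.fit(X_train, y_train)
    print(f"min_samples_leaf={min_leaf:3d}: leaves={tree.get_n_leaves():3d}, "
          f"train={tree.score(X_train, y_train):.3f}, "
          f"holdout={tree.score(X_hold, y_hold):.3f}")
```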
table model- memorizes the training data and performs no generalization
generalization- the model applies to data that was not used to build the model
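a toy sketch of a table model in plain Python (the weather/plan data is made up); it recalls memorized instances perfectly but has no way to generalize to unseen ones

```python
# Hypothetical memorized training data: (weather, day type) -> decision.
training_table = {
    ("rainy", "weekend"): "stay_home",
    ("sunny", "weekend"): "go_out",
    ("sunny", "weekday"): "go_out",
}

def table_model(instance, default="stay_home"):
    # Perfect recall for memorized instances; unseen instances fall back to a default.
    return training_table.get(instance, default)

print(table_model(("rainy", "weekend")))   # memorized -> "stay_home"
print(table_model(("rainy", "weekday")))   # never seen -> arbitrary default
```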