Chapters 17 and 18

Chapter 17

Chapter 18

decision tree algorithm

presentations to management should not explain how algorithms work; they should focus on the algorithms' performance characteristics

overall optimization measure (LogLoss)

ROC curve

LogLoss does not constitute a reasonable measure of how far predictions are from the target in the average case

more understandable measure, such as Fraction of Variance Explained (FVE) Binomial
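
A minimal sketch of how the two measures relate, under a stated assumption: FVE Binomial is taken here as 1 minus the ratio of the model's LogLoss to the LogLoss of a baseline that always predicts the overall positive rate. That definition, and the function names `log_loss` and `fve_binomial`, are illustrative assumptions rather than something these notes spell out.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Average binary LogLoss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def fve_binomial(y_true, p_pred):
    """Assumed definition: 1 - LogLoss(model) / LogLoss(baseline).

    The baseline predicts the overall positive rate for every case.
    The result is not meaningful when all labels are identical.
    """
    base_rate = sum(y_true) / len(y_true)
    baseline = log_loss(y_true, [base_rate] * len(y_true))
    return 1 - log_loss(y_true, p_pred) / baseline


y = [1, 0, 1, 0]
uninformed = [0.5, 0.5, 0.5, 0.5]   # no better than the base rate
confident = [0.9, 0.1, 0.9, 0.1]    # correct and fairly confident

print(log_loss(y, uninformed), fve_binomial(y, uninformed))
print(log_loss(y, confident), fve_binomial(y, confident))
```

Under this assumption a model that only predicts the base rate scores 0, and the score approaches 1 as predictions approach the true labels, which is easier to explain than a raw LogLoss value.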

The many types of tree-based algorithms are often combinations of hundreds or thousands of decision trees

steps

1. Find the most predictive feature and place it at the root of the tree

2. Split the feature into two groups at the point where the two groups are as homogeneous as possible

3. Repeat step 2 for each new branch (box)
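
The steps above can be sketched as a small recursive function. This is an illustration only: it assumes binary 0/1 targets, numeric features, and Gini impurity as the homogeneity measure (the notes do not name one), and the names `gini`, `best_split`, and `build_tree` are hypothetical.

```python
def gini(labels):
    """Gini impurity of 0/1 labels; 0.0 means a perfectly homogeneous group."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Best two-way split of one feature: (threshold, weighted impurity)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """rows is a list of dicts mapping feature name to numeric value."""
    leaf = {"leaf": True, "p": sum(labels) / len(labels)}
    if depth == max_depth or gini(labels) == 0.0:
        return leaf
    # Steps 1 and 2: pick the feature and split point with the purest groups.
    feature, threshold, best_score = None, None, float("inf")
    for name in rows[0]:
        thr, score = best_split([r[name] for r in rows], labels)
        if thr is not None and score < best_score:
            feature, threshold, best_score = name, thr, score
    if feature is None:
        return leaf
    left = [i for i, r in enumerate(rows) if r[feature] <= threshold]
    right = [i for i, r in enumerate(rows) if r[feature] > threshold]
    # Step 3: repeat the split for each new branch.
    return {
        "leaf": False,
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth),
    }


rows = [
    {"age": 20, "noise": 5},
    {"age": 25, "noise": 1},
    {"age": 40, "noise": 4},
    {"age": 50, "noise": 2},
]
labels = [0, 0, 1, 1]
tree = build_tree(rows, labels)
print(tree["feature"], tree["threshold"])
```

In this toy data `age` separates the two classes cleanly while `noise` does not, so `age` ends up at the root.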

receiver operating characteristic (ROC)

accuracy

(TP+TN)/all cases

precision

TP/(TP+FP)

negative predictive value

TN/(TN+FN)
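
A short sketch of these three measures computed from confusion-matrix counts. The function name `confusion_metrics` and the example counts are made up for illustration; negative predictive value is computed as TN/(TN+FN), the share of predicted negatives that are truly negative.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, and negative predictive value."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # (TP+TN) / all cases
        "precision": tp / (tp + fp),       # TP / (TP+FP)
        "npv": tn / (tn + fn),             # TN / (TN+FN)
    }


# Hypothetical counts for 100 scored cases.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```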

AUC: model quality
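
One way to make AUC concrete: the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The sketch below computes it by comparing every positive/negative pair; the function name `auc` and the sample scores are illustrative, and the pairwise loop is only practical for small data.

```python
def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(y_true, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.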

model comparison

overall best model (the ENET Blender, M101)

best non-blender model (XGBoost model M63)

selecting a model

1. Predictive accuracy

2. Prediction speed

3. Speed to build model

4. Familiarity with model

5. Insights