Evaluating Model Performance &
Comparing Model Pairs

Understanding and learning

Without understanding, no presenting

Fraction of explained variance

how much of the variance in the data set the model explains

R^2
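A minimal sketch of the R^2 (fraction of explained variance) calculation; the numbers are made up for illustration:

```python
# R^2 = 1 - (residual sum of squares / total sum of squares)
def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]   # invented actual values
y_pred = [2.8, 5.1, 7.2, 8.9]   # invented predictions
print(r_squared(y_true, y_pred))
```

A value near 1 means the model explains almost all of the variance; 0 means it does no better than predicting the mean.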

Sample Algorithm and Model

general understanding of algorithms/models

decision tree classifier

simple and effective

Gini impurity

Find the most predictive factor and place it at the root

number inpatient visits

Split on that feature into two groups that are as homogeneous as possible

p(inpatient) = 0.5

repeat step 2 for each new branch (box)
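The steps above can be sketched as a single Gini-based split search; the feature values and labels below are invented, with "number of inpatient visits" standing in as the predictive feature:

```python
# Gini impurity of a set of 0/1 labels: 1 - p^2 - (1-p)^2
def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p ** 2 - (1 - p) ** 2

# Step 1-2: try each threshold and keep the split with the
# lowest weighted Gini impurity (most homogeneous groups).
def best_split(values, labels):
    best_t, best_g = None, float("inf")
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        g = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if g < best_g:
            best_t, best_g = t, g
    return best_t, best_g

visits = [0, 1, 2, 5, 6, 8]    # invented inpatient-visit counts
readmit = [0, 0, 0, 1, 1, 1]   # invented outcome labels
print(best_split(visits, readmit))
```

Step 3 would apply the same search recursively to each branch; a full tree builder just repeats this on the left and right groups.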

ROC Curve

single most important screen

Receiver Operating Characteristic

Validation/Cross validation

Prediction distribution

Confusion Matrix

true positive

light is green, ML predicted green

true negative

light is red, ML predicted red

= accuracy

(TP+TN)/all cases

false positive

it was red, ML predicted green

this is the most dangerous error here: the model says go when the light is red

false negative

it was green, ML predicted red
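The four quadrants and accuracy can be tallied in a short sketch; the light observations below are invented, with 1 = green (positive class) and 0 = red:

```python
# Count confusion-matrix quadrants from actual vs. predicted labels.
def confusion(actual, predicted):
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))  # green, predicted green
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))  # red, predicted red
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))  # red, predicted green
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))  # green, predicted red
    return tp, tn, fp, fn

actual    = [1, 1, 1, 0, 0, 0, 0, 1]   # invented observations
predicted = [1, 1, 0, 0, 0, 1, 0, 1]   # invented model outputs
tp, tn, fp, fn = confusion(actual, predicted)
accuracy = (tp + tn) / len(actual)     # accuracy = (TP+TN)/all cases
print(tp, tn, fp, fn, accuracy)
```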

positive predictive value: the two predicted-positive quadrants (TP and FP)

ppv = precision

TP/(TP+FP)

ratio of true positives among all predicted positives

Precision is 48%

48% of the model's positive predictions are correct

true positive rate

tpr = sensitivity (recall)

TP/(TP+FN)

F1 score: harmonic mean of positive predictive value (precision) and true positive rate (recall)

2TP/(2TP+FP+FN)
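These ratios follow directly from the quadrant counts; the tp/fp/fn values below are arbitrary, picked so precision lands on the 48% example above:

```python
# precision = fraction of predicted positives that are correct
def precision(tp, fp):
    return tp / (tp + fp)

# recall (sensitivity, TPR) = fraction of actual positives found
def recall(tp, fn):
    return tp / (tp + fn)

# F1 = harmonic mean of precision and recall
def f1(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)

tp, fp, fn = 12, 13, 3   # invented counts
print(precision(tp, fp), recall(tp, fn), f1(tp, fp, fn))
```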

negative predictive value

TN/(TN+FN)

specificity

TN/(TN+FP)

false positive rate (FPR) = 1 - specificity

FP/(FP+TN)
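The negative-class metrics work the same way; the counts below are invented:

```python
# NPV = fraction of predicted negatives that are correct
def npv(tn, fn):
    return tn / (tn + fn)

# specificity = fraction of actual negatives correctly rejected
def specificity(tn, fp):
    return tn / (tn + fp)

# FPR = fraction of actual negatives wrongly flagged; equals 1 - specificity
def fpr(fp, tn):
    return fp / (fp + tn)

tn, fp, fn = 40, 10, 5   # invented counts
print(npv(tn, fn), specificity(tn, fp), fpr(fp, tn))
```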

threshold is set so anything above it is predicted positive

AUC = area under the curve

100 bins

ideal is blue and orange lines overlapping
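A sketch of AUC using the equivalent rank formulation (the probability that a randomly chosen positive outscores a randomly chosen negative, which equals the area under the ROC curve); scores and labels below are invented:

```python
# AUC via pairwise comparisons: count how often a positive's score
# beats a negative's score (ties count as half a win).
def auc(scores, labels):
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]   # invented model scores
labels = [1,   1,   0,   1,   0,   0]     # invented true labels
print(auc(scores, labels))
```

An AUC of 1.0 means every positive outranks every negative; 0.5 is random guessing.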

enable drill down

cross validation prediction

search-and-replace True/False with 1/0
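The True/False-to-1/0 cleanup step might look like this; the column values are invented:

```python
# Convert exported TRUE/FALSE strings to 1/0 so they can be used in metric math.
preds = ["TRUE", "FALSE", "TRUE", "TRUE"]   # invented exported prediction column
as_ints = [1 if p.upper() == "TRUE" else 0 for p in preds]
print(as_ints)
```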

Comparing Model Pairs

model comparison

change the number of bins

Blended/unblended models

at the highest threshold, true positive rate = 0 and false positive rate = 0

as the threshold is lowered, the ROC curve extends from (0, 0) to (1, 1)

dual lift

Compute data for this model

the ENET blender is closer to the 14-bin truth

prioritizing model criteria

predictive accuracy

prediction speed

speed to build model

Familiarity with model

Insight

insight often comes at a cost to the other four

training the model

how long can the model really last?