Evaluate Model Performance
One must understand the models and how they run
A Sample Algorithm and Model
Decision Tree Classifier
The tree-based algorithms group was by far the largest.
"Arguably, if a budding data scientist were to learn only
one algorithm, the decision tree would be the one due to both its conceptual
simplicity and its effectiveness."
Not expected to be the most effective/a negative indicator
decision tree steps
1. Find the most predictive feature (the one that best explains the
target) and place it at the root of the tree.
2. Split the feature into two groups at the point where
the two groups are as homogeneous as possible.
3. Repeat step 2 for each new branch (box).
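The steps above can be sketched in plain Python. This is a minimal illustration, not DataRobot's implementation: the notes do not say how homogeneity is measured, so Gini impurity is assumed here, and the names `gini`, `best_split`, `build_tree`, and `predict` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) for the best binary split.

    Returns None when no feature has two distinct values to split between.
    Assumption: all features are numeric.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Steps 1-3: pick the best split, then repeat on each new branch."""
    can_split = len(set(labels)) > 1 and depth < max_depth
    split = best_split(rows, labels) if can_split else None
    if split is None:
        # Leaf: predict the majority class of the cases that reached this box.
        return Counter(labels).most_common(1)[0][0]
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree
```

Calling `build_tree(rows, labels)` on a small list of numeric feature rows returns a nested dictionary whose root holds the most predictive feature, which mirrors step 1.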
ROC Curve
Receiver Operating Characteristic (ROC) curve
DataRobot has determined a threshold that maximizes a common measure
Selecting “Frequency” changes the presentation to show how many cases ended up at each probability
positive predictive value (PPV)
Derived from the two rightmost quadrants; more often called precision.
TP/(TP+FP)
True Positive Rate (TPR)
TP/(TP+FN)
F1 score: harmonic mean of Positive Predictive Value and True Positive Rate
2TP/(2TP+FP+FN)
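As a rough check on the three formulas above, the sketch below counts confusion-matrix cells at a given threshold and computes PPV, TPR, and their harmonic mean. The function names and the rule that a case is predicted positive when its probability is at or above the threshold are assumptions made for illustration; they are not taken from DataRobot.

```python
def confusion_counts(probabilities, actuals, threshold):
    """Count (TP, FP, FN, TN), predicting positive when probability >= threshold."""
    tp = fp = fn = tn = 0
    for p, actual in zip(probabilities, actuals):
        predicted_positive = p >= threshold
        if predicted_positive and actual:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn):
    """Return (PPV, TPR, harmonic mean of the two) from confusion-matrix counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    ppv = tp / (tp + fp) if (tp + fp) else 0.0         # TP/(TP+FP)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0         # TP/(TP+FN)
    denom = 2 * tp + fp + fn
    harmonic_mean = 2 * tp / denom if denom else 0.0   # 2TP/(2TP+FP+FN)
    return ppv, tpr, harmonic_mean
```

With 30 true positives, 10 false positives, and 20 false negatives, PPV is 0.75, TPR is 0.6, and both the harmonic-mean formula 2(0.75)(0.6)/(0.75+0.6) and 2TP/(2TP+FP+FN) = 60/90 give about 0.667.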
Using the Lift Chart for Business Decisions
lift chart
Sorts all validation cases by their predicted probability of
readmission.
The cases are split into 10% bins,
that is, bins of 800 cases each.
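The binning behind a lift chart can be sketched as follows. This is an illustrative assumption about the mechanics, not DataRobot's code: the function name `lift_chart_bins`, the ascending sort order, and the choice to report mean predicted and mean actual values per bin are all hypothetical.

```python
def lift_chart_bins(probabilities, actuals, n_bins=10):
    """Sort cases by predicted probability and summarize them in equal-sized bins.

    Returns one (mean_predicted, mean_actual, case_count) tuple per non-empty bin,
    ordered from lowest to highest predicted probability.
    """
    pairs = sorted(zip(probabilities, actuals))
    n = len(pairs)
    bins = []
    for i in range(n_bins):
        # Integer arithmetic keeps bin sizes within one case of each other.
        chunk = pairs[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            continue
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        mean_actual = sum(a for _, a in chunk) / len(chunk)
        bins.append((mean_predicted, mean_actual, len(chunk)))
    return bins
```

With ten bins of 800 cases, the validation set would hold 8,000 cases in total; that total is inferred from the arithmetic in the notes rather than stated in them. Comparing `mean_predicted` with `mean_actual` bin by bin shows where a model over- or under-predicts.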
Comparing Model Pairs
Model Comparison
Select Model Comparison, click change model, then select lift.
Dual Lift
Prioritizing Modeling Criteria and Selecting a Model
5 criteria for model selection
Predictive accuracy
Prediction speed
Speed to build model
Familiarity with model
Insights