Evaluate Model Performance (ROC Curve (The Receiver Operating…
Evaluate Model Performance
One must understand the models and how they run
A Sample Algorithm and Model
Decision Tree Classifier
tree-based algorithms group was by far
"Arguably, if a budding data scientist were to learn only
one algorithm, the decision tree would be the one due to both its conceptual
simplicity and its effectiveness."
Not expected to be the most effective/a negative indicator
decision tree steps
Find the most predictive feature (the one that best explains the
target) and place it at the root of the tree.
Split the feature into two groups at the point of the feature where
the two groups are as homogenous as possible
Repeat step 2 for each new branch (box).
The Receiver Operating Characteristics
DataRobot has determined a threshold that maximizes a common measure
changes to a presentation of how many cases ended up at each probability
positive predictive value (PPV)
the two rightmost quadrants and is more often called
True Positive Rate (TPR)
harmonic mean of Positive Predictive Value and True
Using the Lift Chart for Business Decisions
sorts all validation cases by their probability of
the cases are split into 10% bins,
that is, bins of 800 cases
Comparing Model Pairs
Select Model Comparison, click change model, select lift,
Prioritizing Modeling Criteria and Selecting a Model
5 criteria to model selection
Speed to build model.
Familiarity with model.