Chapter 19: Interpret Model
Feature Impacts on Target
4 relationships used for exploring why a model predicts certain outcomes
Overall impact of feature w/o consideration of the impact of other features
Treats each feature as a standalone effect on the target
Each feature is individually examined in a logistic regression model against the target
Lets data scientists focus on the features most likely to yield additional predictive value if the AutoML has misinterpreted them
These scores are not fully reliable indicators of a feature's value and should not be relied on for feature selection or model interpretation
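A minimal sketch of the standalone scoring idea above: each feature gets its own one-variable logistic regression against the target, and features are ranked by how well that tiny model fits. The synthetic columns (`signal`, `noise`), the gradient-descent settings, and the use of log loss as the ranking metric are assumptions for illustration, not the AutoML's actual procedure.

```python
import math
import random

def fit_univariate_logistic(xs, ys, lr=0.1, epochs=500):
    """Fit p = sigmoid(w*x + b) by batch gradient descent; return (w, b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += (p - y)
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def univariate_log_loss(xs, ys, w, b, eps=1e-12):
    """Average log loss of the one-feature model (lower is better)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)

# Synthetic data (assumed): "signal" drives the target, "noise" does not.
random.seed(0)
n = 200
signal = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]
target = [1 if s + random.gauss(0, 0.5) > 0 else 0 for s in signal]

# Score every feature on its own, ignoring all other features.
scores = {}
for name, col in {"signal": signal, "noise": noise}.items():
    w, b = fit_univariate_logistic(col, target)
    scores[name] = univariate_log_loss(col, target, w, b)

ranked = sorted(scores, key=scores.get)  # best standalone feature first
print(ranked, scores)
```

Because each feature is scored in isolation, two highly correlated features would both rank well here, which is one reason these scores are a weak basis for feature selection.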
Overall impact of a feature adjusted for the impact of other features
Calculates a value of each feature in the context of the model
Compares model performance without the feature's information against the model that retained all the features
Most important feature is scaled to a score of 100% and the others are scaled relative to it
Usain Bolt example
Creating models with fewer features avoids overfitting and reduces problems due to changes in the databases and sources of data
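The scaling described above (top feature at 100%, the rest relative to it) can be sketched as follows. This sketch assumes the impact is measured by shuffling one feature at a time and recording how much the error grows; the notes only say performance is compared with the model that kept all features, so the permutation approach, the hypothetical `model` function, and the feature names `a`, `b`, `c` are illustrative assumptions.

```python
import random

def model(row):
    # Hypothetical fitted model: leans heavily on "a", lightly on "b", ignores "c".
    return 3.0 * row["a"] + 1.0 * row["b"]

def mse(rows, ys):
    """Mean squared error of the hypothetical model on the given rows."""
    return sum((model(r) - y) ** 2 for r, y in zip(rows, ys)) / len(rows)

def permutation_impact(rows, ys, features, seed=0):
    """Error increase after shuffling each feature, scaled so the top feature is 100."""
    rng = random.Random(seed)
    baseline = mse(rows, ys)
    raw = {}
    for f in features:
        shuffled = [r[f] for r in rows]
        rng.shuffle(shuffled)
        # Copy each row with only feature f replaced by a shuffled value.
        permuted = [dict(r, **{f: v}) for r, v in zip(rows, shuffled)]
        raw[f] = max(mse(permuted, ys) - baseline, 0.0)
    top = max(raw.values())
    return {f: 100.0 * (v / top) if top > 0 else 0.0 for f, v in raw.items()}

# Synthetic data (assumed) where the target is exactly what the model predicts.
data_rng = random.Random(1)
rows = [{k: data_rng.gauss(0, 1) for k in ("a", "b", "c")} for _ in range(300)]
ys = [model(r) for r in rows]

impact = permutation_impact(rows, ys, ["a", "b", "c"])
print(impact)
```

A feature the model never uses, such as `c` here, scores zero, which makes it a natural candidate to drop when building a model with fewer features.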
Directional impact of a feature
Shows whether the presence of a value helps the model's predictions
Provides a greater understanding of the data and its context
Examine the LogLoss score after model has finished running
Provides coefficients for the most important feature characteristics that drive a prediction decision
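For reference, the LogLoss score mentioned above can be computed by hand for binary targets. This is a small sketch of the standard formula; the function name and the clipping constant `eps` are choices made here for illustration.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary log loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Same two cases (actual 1, actual 0) scored under three sets of predictions.
confident_right = log_loss([1, 0], [0.9, 0.1])
uninformed = log_loss([1, 0], [0.5, 0.5])
confident_wrong = log_loss([1, 0], [0.1, 0.9])
print(confident_right, uninformed, confident_wrong)
```

Lower is better: confident correct predictions score near zero, a coin-flip prediction scores ln(2), and confident wrong predictions are penalized heavily.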
Partial impact of a feature
Constructs a list of features ranked by their influence on the target as denoted by the size of the green line under each feature name
Focus on features showing large differences between predicted and actual
A strong result is when the locations of the yellow dot values on the rightmost Y-axis change significantly
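The partial impact of a feature can be sketched as a partial dependence curve: fix the feature at one value for every row, average the predictions, and repeat across a grid of values. This assumes the yellow dots described above represent partial dependence; the `model` function and the feature names `x1` and `x2` are hypothetical.

```python
def model(row):
    # Hypothetical fitted model, for illustration only.
    return 2.0 * row["x1"] + row["x2"] ** 2

def partial_dependence(rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value in turn."""
    curve = []
    for value in grid:
        preds = [model(dict(r, **{feature: value})) for r in rows]
        curve.append(sum(preds) / len(preds))
    return curve

rows = [
    {"x1": 0.0, "x2": 1.0},
    {"x1": 1.0, "x2": 2.0},
    {"x1": 2.0, "x2": 3.0},
]
pd_x1 = partial_dependence(rows, "x1", [0.0, 1.0, 2.0])
print(pd_x1)
```

A curve that moves substantially across the grid corresponds to the "strong result" described above; a flat curve suggests the feature has little partial impact once the others are held as observed.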
The Power of Language
Subject Matter Expert (SME)
Important during evaluation of text models
Word clouds represent words that have the highest coefficients
Can filter stop words: common words generally assumed to have little value (ex: "of")
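A small sketch of stop-word filtering before picking the words a word cloud would emphasize. The stop-word list, the coefficient values, and the `top_words` helper are made up for illustration.

```python
STOP_WORDS = {"of", "the", "a", "and", "to", "in"}  # illustrative, not exhaustive

def top_words(coefficients, k=3, stop_words=STOP_WORDS):
    """Return the k words with the largest absolute coefficients, skipping stop words."""
    kept = {w: c for w, c in coefficients.items() if w.lower() not in stop_words}
    return sorted(kept, key=lambda w: abs(kept[w]), reverse=True)[:k]

# Hypothetical word coefficients from a text model.
coefs = {"of": 0.90, "refund": 0.75, "the": 0.10, "delay": -0.60, "thanks": 0.20}
words = top_words(coefs)
print(words)
```

Without the filter, "of" would top the list despite carrying little meaning, which is the kind of result a subject matter expert would flag.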
Hotspots
A visualization that shows the most relevant (up to 4) combinations of features and their effect on the target
The largest and most overlapping hotspots are organized in the middle
The deeper the tone, the greater the impact that particular combination of features has on the target
Mean Relative target
Can do both hotspots & coldspots
Not recommended to use during presentations because of its high level of detail and complexity
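A sketch of how a hotspot or coldspot could be scored, assuming "mean relative target" means the subgroup's average target divided by the overall average target. The rows, the column names, and the rules are hypothetical.

```python
def mean_relative_target(rows, target, rule):
    """Subgroup mean of `target` divided by the overall mean; None if undefined."""
    overall = sum(r[target] for r in rows) / len(rows)
    subgroup = [r for r in rows if rule(r)]
    if not subgroup or overall == 0:
        return None
    return (sum(r[target] for r in subgroup) / len(subgroup)) / overall

rows = [
    {"age": 25, "visits": 5, "churn": 1},
    {"age": 30, "visits": 6, "churn": 1},
    {"age": 45, "visits": 1, "churn": 0},
    {"age": 50, "visits": 2, "churn": 0},
]

# A rule combining two features (hotspots may combine up to four).
hot = mean_relative_target(rows, "churn", lambda r: r["age"] < 35 and r["visits"] >= 5)
cold = mean_relative_target(rows, "churn", lambda r: r["age"] >= 45)
print(hot, cold)
```

Under this assumed definition, a value above 1 marks a hotspot (the subgroup has a higher target rate than the population) and a value below 1 marks a coldspot.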
Reason Codes
A powerful feature that can supplement business decisions (ex: predicting turnover)
Support further evaluation of why a given case received its predicted probability
An analysis of values and their probability