Chapter 19: Interpret Model
Feature Target relationships
Overall Impact of feature without consideration of the impact of other features
stand-alone effect on the target
logistic regression model
full green bar means feature explains at least 30%
hover over to see optimization bar
Allows data scientist to focus attention on features most likely to yield additional predictive value if misinterpreted.
overlapping information distorts measurement
Do not rely on for feature selection and model interpretation
Overall impact of a feature adjusted for the impact of other features
Most important feature scaled to 100%
watch out for text features!
Give the list a new name and select how many top features to include
Creating models with fewer features is a good idea to avoid overfitting
Directional impact of a feature
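A minimal sketch of an impact score "adjusted for the impact of other features" and scaled so the top feature is 100%. It uses permutation importance (shuffle one column, measure the accuracy drop), which is one common technique; whether the tool in these notes computes impact exactly this way is an assumption. The toy data, the `predict` stand-in, and all function names are hypothetical.

```python
import random

def accuracy(predict, rows, targets):
    """Fraction of rows where the predicted label matches the target."""
    hits = sum(1 for r, t in zip(rows, targets) if predict(r) == t)
    return hits / len(rows)

def permutation_impact(predict, rows, targets, n_features, seed=0):
    """Accuracy drop after shuffling each feature, rescaled so the top drop is 100."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, targets)
    drops = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        drops.append(max(0.0, baseline - accuracy(predict, shuffled, targets)))
    top = max(drops)
    return [100.0 * (d / top) if top > 0 else 0.0 for d in drops]

# Hypothetical toy data: (prior_visits, noise) -> readmitted
data_rng = random.Random(1)
rows = [(data_rng.randint(0, 10), data_rng.randint(0, 10)) for _ in range(200)]
targets = [1 if r[0] > 5 else 0 for r in rows]

def predict(row):
    """Stand-in for a fitted model: uses only the first feature."""
    return 1 if row[0] > 5 else 0

impact = permutation_impact(predict, rows, targets, n_features=2)
```

Keeping only the features with the highest scores is one way to build the shorter feature list mentioned above.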
whether the presence of a value helps the model by assisting it in predicting readmission or non-readmission
Insights to Variable effects
Logistic regression analysis
Coefficients
Red bars indicate features that push predictions toward positive cases
Blue bars indicate features that push predictions toward negative cases
Partial impact of a feature
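To illustrate how the sign of a logistic regression coefficient maps to the red and blue bars, here is a small sketch. The intercept, coefficient values, and feature names are hypothetical, not taken from the chapter's model.

```python
import math

def predict_probability(intercept, coefficients, case):
    """Logistic regression: sigmoid of intercept plus the weighted feature values."""
    score = intercept + sum(coefficients[name] * value for name, value in case.items())
    return 1.0 / (1.0 + math.exp(-score))

def bar_color(coefficient):
    """Positive coefficients raise the predicted probability (red), negative lower it (blue)."""
    return "red" if coefficient > 0 else "blue"

# Hypothetical values for illustration only
intercept = -1.0
coefficients = {"emergency_admission": 0.8, "had_follow_up_visit": -0.6}

baseline = predict_probability(
    intercept, coefficients, {"emergency_admission": 0, "had_follow_up_visit": 0})
with_emergency = predict_probability(
    intercept, coefficients, {"emergency_admission": 1, "had_follow_up_visit": 0})
with_follow_up = predict_probability(
    intercept, coefficients, {"emergency_admission": 0, "had_follow_up_visit": 1})

colors = {name: bar_color(c) for name, c in coefficients.items()}
```

Here `with_emergency` is above `baseline` and `with_follow_up` is below it, matching the direction of each coefficient.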
Model X-Ray Screen
Lists features by rank of their influence on the target
Displays 100% of green bar
Y-axis contains frequency of cases
x-axis - values of most important feature
Orange dot is actual score - # of cases averaged TF (right Y-axis)
Blue cross (predicted) - averages the prediction probabilities
focus on features with large gap between actual and predicted
Partial dependence (effect size)
marginal effect of a value when all other features are constant
Look at change in yellow dots
Log the x-axis to look for relationships that increase towards high values on the chart
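Partial dependence can be sketched as follows: for each candidate value of one feature, set that value on every row, leave the other features as observed, and average the predictions. The `toy_model` function, its weights, and the rows are hypothetical stand-ins for a fitted model and real data.

```python
import math

def partial_dependence(predict_proba, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        altered = [r[:feature_index] + (value,) + r[feature_index + 1:] for r in rows]
        average = sum(predict_proba(r) for r in altered) / len(altered)
        curve.append((value, average))
    return curve

def toy_model(row):
    """Hypothetical fitted model: probability rises with prior visits."""
    prior_visits, follow_up = row
    return 1.0 / (1.0 + math.exp(-(0.4 * prior_visits - 0.8 * follow_up - 1.0)))

# Hypothetical rows: (prior_visits, follow_up)
rows = [(0, 1), (1, 0), (2, 1), (5, 0), (8, 1), (10, 0)]

# A doubling grid is evenly spaced on a logged x-axis
curve = partial_dependence(toy_model, rows, feature_index=0, grid=[1, 2, 4, 8, 16])
```

Each point in `curve` corresponds to one dot on the partial dependence line; the change from one dot to the next is the effect size.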
Power of language
Insights - text mining
Shows most important terms in predicting
Access to subject matter expert will be important
analytics is becoming so easy that it is easier to teach subject matter experts analytics than to teach data scientists the subject matter expertise
Word cloud - represents the words that have the highest coefficients
Intensity of red or blue indicates the size of their coefficients
Filter stop words - common words that have little value
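The word cloud logic above can be sketched like this: drop stop words, rank the remaining terms by coefficient size, and derive a color from the sign and an intensity from the magnitude. The stop-word list and term coefficients are hypothetical.

```python
STOP_WORDS = {"the", "and", "of", "a", "to"}  # hypothetical, deliberately tiny list

def word_cloud_entries(term_coefficients, stop_words=STOP_WORDS, top_n=5):
    """Return (term, color, intensity) for the terms with the largest coefficients."""
    kept = {t: c for t, c in term_coefficients.items() if t.lower() not in stop_words}
    if not kept:
        return []
    largest = max(abs(c) for c in kept.values()) or 1.0  # avoid dividing by zero
    ranked = sorted(kept.items(), key=lambda item: abs(item[1]), reverse=True)[:top_n]
    return [
        (term, "red" if coef > 0 else "blue", abs(coef) / largest)
        for term, coef in ranked
    ]

# Hypothetical term coefficients from a text model
terms = {"the": 0.9, "chest": 0.6, "stable": -0.3, "of": 0.1, "pain": 0.45}
entries = word_cloud_entries(terms)
```

Intensity is scaled so the strongest remaining term is 1.0; "the" and "of" never reach the cloud even though "the" has a large coefficient.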
Hotspots
insights to hotspots
uses rule-fit classifier
rerun the model if it stalled at 16%
Shows the most relevant combinations of features and their effect on target
the mean relative target is the result of dividing the target value of this hotspot group by the average number of readmits; it accounts for how predictive the hotspot is
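A worked sketch of the mean relative target, reading it as the readmission rate inside the hotspot group divided by the overall readmission rate. The cases, the field names, and the rule are hypothetical.

```python
def mean_relative_target(cases, rule, target="readmitted"):
    """Hotspot group's mean target divided by the overall mean target."""
    overall_mean = sum(c[target] for c in cases) / len(cases)
    group = [c for c in cases if rule(c)]
    if not group or overall_mean == 0:
        raise ValueError("need a non-empty hotspot group and a non-zero overall mean")
    group_mean = sum(c[target] for c in group) / len(group)
    return group_mean / overall_mean

# Hypothetical cases: 4 of 8 readmitted overall, 3 of 4 inside the hotspot
cases = [
    {"prior_visits": 6, "readmitted": 1},
    {"prior_visits": 7, "readmitted": 1},
    {"prior_visits": 9, "readmitted": 1},
    {"prior_visits": 8, "readmitted": 0},
    {"prior_visits": 1, "readmitted": 1},
    {"prior_visits": 0, "readmitted": 0},
    {"prior_visits": 2, "readmitted": 0},
    {"prior_visits": 3, "readmitted": 0},
]

def hotspot_rule(case):
    """Hypothetical hotspot: patients with more than 5 prior visits."""
    return case["prior_visits"] > 5

ratio = mean_relative_target(cases, hotspot_rule)
```

A ratio above 1 means the hotspot group is readmitted more often than average; here 0.75 / 0.5 gives 1.5.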
reason codes
analysis of the feature values for a specific case that determined the probability of the target
threshold: number of cases you want to examine
importance of each feature is ranked strong, medium, or weak
help supplement business decisions
used in settings where humans are involved in the decision process
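One possible way to produce reason codes for a single case, sketched for a linear model where each feature's contribution is its coefficient times its value. This is an assumption for illustration, not the method the tool necessarily uses; the cutoffs for strong, medium, and weak, the coefficients, and the case are all hypothetical.

```python
def reason_codes(coefficients, case, top_n=3):
    """Rank features for one case by contribution size and label their strength."""
    contributions = {name: coefficients[name] * case[name] for name in coefficients}
    largest = max(abs(v) for v in contributions.values()) or 1.0
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    codes = []
    for name, value in ranked[:top_n]:
        share = abs(value) / largest
        # Hypothetical cutoffs: thirds of the largest contribution
        strength = "strong" if share >= 0.66 else "medium" if share >= 0.33 else "weak"
        direction = "+" if value >= 0 else "-"
        codes.append((name, direction, strength))
    return codes

# Hypothetical model and case
coefficients = {"prior_admissions": 0.5, "age_over_65": 0.4, "had_follow_up_visit": -0.9}
case = {"prior_admissions": 4, "age_over_65": 1, "had_follow_up_visit": 1}

codes = reason_codes(coefficients, case)
```

A person reviewing this case would see that prior admissions push the prediction up strongly, while the follow-up visit pulls it down with medium strength.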