Chapter 19 interpret model
19.1 feature impacts on target
one of the advantages of supervised machine learning is that all relationships are measured in terms of their relationship with the target
four types of relationships
the overall impact of a feature without consideration of the impact of other features
the overall impact of a feature adjusted for the impact of other features
the directional impact of a feature
the partial impact of a feature
19.2 the overall impact of features on the target without consideration of other features
treats each feature as a standalone effect on the target
importance score
useful because it lets a data scientist focus on the features most likely to yield additional predictive value if the AutoML has misinterpreted them
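A minimal sketch of a standalone score, where each feature is compared with the target on its own and the other features are ignored. Absolute Pearson correlation is swapped in here as a simple stand-in; it is not claimed to be the importance score the AutoML tool computes. The toy rows and the column names (`num_inpatient`, `age`, `readmitted`) are hypothetical.

```python
from math import sqrt

# Hypothetical toy patients; column names are illustrative only.
rows = [
    {"num_inpatient": 0, "age": 60, "readmitted": 0},
    {"num_inpatient": 0, "age": 45, "readmitted": 0},
    {"num_inpatient": 1, "age": 70, "readmitted": 0},
    {"num_inpatient": 2, "age": 50, "readmitted": 1},
    {"num_inpatient": 3, "age": 65, "readmitted": 1},
    {"num_inpatient": 4, "age": 55, "readmitted": 1},
]

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def standalone_importance(rows, target):
    """Score every feature against the target in isolation, strongest first."""
    ys = [r[target] for r in rows]
    features = [name for name in rows[0] if name != target]
    scores = {f: abs(pearson([r[f] for r in rows], ys)) for f in features}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = standalone_importance(rows, "readmitted")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data `num_inpatient` ranks above `age`, so it would be the first feature to inspect.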
19.3 overall impact of a feature adjusted for the impact of other features
use top model
best results come from the most accurate model
want a low validation score for the best model (assuming an error metric such as LogLoss, where lower is better)
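One common way to measure a feature's impact while the other features stay in the model is permutation importance: shuffle one column, re-score the model, and see how much the error grows. Whether the tool in these notes uses exactly this procedure is an assumption. `predict_proba` below is a hypothetical stand-in for the top model, and the rows are made up.

```python
import random
from math import log

# Hypothetical validation rows: readmission depends only on num_inpatient.
rows = [
    {"num_inpatient": i % 4, "noise_digit": (i * 7) % 5,
     "readmitted": 1 if i % 4 >= 2 else 0}
    for i in range(20)
]

def predict_proba(row):
    """Hypothetical stand-in for the top model; it ignores noise_digit."""
    return 0.9 if row["num_inpatient"] >= 2 else 0.1

def log_loss(rows):
    """Mean LogLoss of predict_proba on the given rows (lower is better)."""
    total = 0.0
    for row in rows:
        p = predict_proba(row)
        total += -log(p) if row["readmitted"] == 1 else -log(1 - p)
    return total / len(rows)

def permutation_impact(rows, feature, rng):
    """Increase in LogLoss after shuffling one feature column."""
    shuffled = [row[feature] for row in rows]
    rng.shuffle(shuffled)
    permuted = [dict(row, **{feature: value})
                for row, value in zip(rows, shuffled)]
    return log_loss(permuted) - log_loss(rows)

rng = random.Random(0)  # fixed seed so the shuffle is repeatable
impact = {f: permutation_impact(rows, f, rng)
          for f in ("num_inpatient", "noise_digit")}
print(impact)
```

A feature the model never uses gets an impact of zero, which is why the score is only as trustworthy as the model it is computed from.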
19.4 the directional impact of features on target
whether the presence of a feature value helps the model predict readmissions or non-readmissions
insights
variable effects: shows the logistic regression
gives you more things to better understand the data
provides what are commonly known as coefficients for the most important feature values that drive a prediction decision
red bars indicate feature values that make positive cases more likely
blue bars indicate feature values that make negative cases more likely
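A small sketch of how the sign of a logistic regression coefficient maps to the bar colors above: a positive coefficient raises the predicted probability of the positive case (red), a negative one lowers it (blue). The intercept, coefficient values, and feature-value names are hypothetical, not taken from the model in these notes.

```python
from math import exp

# Hypothetical intercept and coefficients for two feature values.
intercept = -1.0
coefficients = {
    "prior_inpatient_visit": 1.2,   # pushes toward readmission
    "discharged_to_home": -0.8,     # pushes toward non-readmission
}

def bar_color(coef):
    """Red for values that make positive cases more likely, blue otherwise."""
    if coef > 0:
        return "red"
    if coef < 0:
        return "blue"
    return "none"

def probability(active_values):
    """Logistic regression probability when the listed feature values are present."""
    z = intercept + sum(coefficients[name] for name in active_values)
    return 1.0 / (1.0 + exp(-z))

for name, coef in coefficients.items():
    print(name, bar_color(coef), round(probability([name]), 3))
```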
19.5 the partial impact of features on target
Model X-Ray constructs a list of features ranked by their influence on the target, denoted by the size of the green line under each feature name in the left pane
y axis contains the frequency of cases in the validation set
x axis contains the values of the most predictive feature
averages the predicted probabilities across cases at each feature value
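The averaging step can be sketched as a partial dependence calculation: for each value of the feature, set that value on every case, predict, and average the probabilities. This is a generic sketch, not the tool's own implementation; `predict_proba`, its weights, and the four rows are hypothetical.

```python
from math import exp

# Hypothetical validation cases.
rows = [
    {"num_inpatient": 0, "age": 45},
    {"num_inpatient": 1, "age": 60},
    {"num_inpatient": 0, "age": 72},
    {"num_inpatient": 3, "age": 55},
]

def predict_proba(row):
    """Hypothetical fitted model returning a readmission probability."""
    z = -2.0 + 0.6 * row["num_inpatient"] + 0.02 * (row["age"] - 50)
    return 1.0 / (1.0 + exp(-z))

def partial_dependence(rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = [predict_proba(dict(row, **{feature: value})) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve

curve = partial_dependence(rows, "num_inpatient", grid=[0, 1, 2, 3])
for value, avg in curve:
    print(value, round(avg, 3))
```

Because every other feature keeps its observed value, the curve isolates the partial effect of the one feature being varied.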
19.6 power of language
shows that the most important terms leading to readmissions were valves and renal
subject matter expert
important for evaluating models
may need to examine the diagnosis codes that contain the word valve
word cloud
represents the words that have the highest coefficients
intensity of red or blue indicates the size of their coefficient
19.7 hotspots
RuleFit classifier
immediate red flag
rerun this with the full data, as always
shows the most relevant combinations of features and their effect on the target
19.8 reason codes
understanding the data at the more granular individual patient level remains limited
refers to the validation data set
left and right thresholds allow for the specification of the probability cutoffs to be used in this view
left (blue) threshold: reason codes are desired for all patients with a probability of readmission between 0 and .22
right (red) threshold: reason codes are also desired for any patient with a probability between .585 and 1
the preview section at the bottom of the screen shows the top three and bottom three cases
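The threshold logic above can be sketched as a simple filter on predicted probabilities. The cutoffs come from these notes; the patient ids and probabilities are made up, and treating the cutoffs as inclusive is an assumption.

```python
# Cutoffs from the notes; inclusive comparison is an assumption.
LOW_CUTOFF = 0.22
HIGH_CUTOFF = 0.585

# Hypothetical predicted readmission probabilities per patient.
predictions = {
    "p01": 0.05, "p02": 0.18, "p03": 0.31, "p04": 0.47,
    "p05": 0.60, "p06": 0.74, "p07": 0.91, "p08": 0.12,
}

def wants_reason_codes(probability):
    """True when the probability falls in the low (blue) or high (red) band."""
    return probability <= LOW_CUTOFF or probability >= HIGH_CUTOFF

selected = {pid: p for pid, p in predictions.items() if wants_reason_codes(p)}

# Preview: the three highest and three lowest probabilities.
ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
top_three = ranked[:3]
bottom_three = ranked[-3:]

print(sorted(selected))
print(top_three)
print(bottom_three)
```

Patients in the middle band (here `p03` and `p04`) are left out, since their predictions are the least decisive.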