Ch.20 communicate model insights
five types of information that should be communicated during a presentation:
model quality metrics (confusion matrix)
areas where a model struggles (potential for improvement through more data -- features & cases)
most predictive features for model building
feature types especially interesting to management (e.g., insights into the business problem and unknowns uncovered during the modeling process)
recommended business actions (i.e., to implement model or not, any business decisions to implement at various probability thresholds, and how will doing so change practice?)
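To make the idea of business decisions at various probability thresholds concrete, the sketch below counts how many cases a model would flag, catch, and miss at a few cutoffs. The probabilities, labels, thresholds, and the helper name `summarize_threshold` are all hypothetical illustrations, not output from the project discussed in these notes.

```python
def summarize_threshold(probabilities, actuals, threshold):
    """Count flagged, caught, and missed positives at one probability cutoff.

    probabilities: predicted probability of the positive class per case
    actuals: 1 if the case was truly positive, else 0
    """
    flagged = [actual for prob, actual in zip(probabilities, actuals) if prob >= threshold]
    caught = sum(flagged)           # true positives among flagged cases
    missed = sum(actuals) - caught  # positives the cutoff leaves unflagged
    return {
        "threshold": threshold,
        "flagged": len(flagged),
        "caught": caught,
        "missed": missed,
    }


# Hypothetical predictions and outcomes (assumed data, for illustration only).
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
actuals = [1, 1, 0, 1, 0, 1, 0, 0]

for cutoff in (0.25, 0.5, 0.75):
    print(summarize_threshold(probs, actuals, cutoff))
```

A table like this lets management see the trade-off directly: a lower cutoff flags more cases and misses fewer positives, at the cost of acting on more cases overall.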
we focus on what information should be shared during a machine-learning presentation so that it is clear, understandable by your audience, and indicates next steps for the business problem.
first we want to check that we haven't made any mistakes in the model creation process by releasing the holdout data.
after unlocking the holdout sample, click on the holdout column to re-sort the leaderboard by those scores.
then observe the ranking differences between the cross validation and holdout scores
the best outcome would be one in which the order of models did not change between the two sorts.
the second-best outcome would be when your top models stay on top
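The comparison of the two sorts can also be checked programmatically once the scores are exported. The sketch below is a minimal illustration that assumes a lower-is-better metric such as LogLoss; the model names, scores, and the helper names `rank_models` and `compare_rankings` are hypothetical and not part of any DataRobot API.

```python
def rank_models(scores, lower_is_better=True):
    """Return model names ordered from best to worst for the given scores."""
    return sorted(scores, key=scores.get, reverse=not lower_is_better)


def compare_rankings(cv_scores, holdout_scores, top_n=3, lower_is_better=True):
    """Compare leaderboard order under cross validation and holdout scores."""
    cv_order = rank_models(cv_scores, lower_is_better)
    holdout_order = rank_models(holdout_scores, lower_is_better)
    return {
        "cv_order": cv_order,
        "holdout_order": holdout_order,
        # Best outcome: the order of models is identical in both sorts.
        "same_order": cv_order == holdout_order,
        # Second-best outcome: the same models occupy the top positions.
        "top_models_stable": set(cv_order[:top_n]) == set(holdout_order[:top_n]),
    }


# Hypothetical LogLoss scores (assumed values, for illustration only).
cv = {"model_a": 0.601, "model_b": 0.605, "model_c": 0.612, "model_d": 0.630}
holdout = {"model_a": 0.603, "model_b": 0.610, "model_c": 0.608, "model_d": 0.635}

result = compare_rankings(cv, holdout, top_n=3)
print(result["same_order"], result["top_models_stable"])
```

In this made-up example the second and third models swap places, so the order changes, but the same three models stay on top.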
business problem first
We started Section II (Define Project Objectives) with a business problem. This problem should have guided your work continually. The problem also should have been refined, as the analytics (discussed in Section IV [Model Data]) should have taught us more about the model and our features.
pre-processing and model quality metrics
understand the model performance characteristics
positive predictive value (precision)
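Positive predictive value is the share of predicted positives that are truly positive, and it can be read straight off the confusion matrix. The sketch below computes it next to overall accuracy and recall so the differences are visible; the counts and the helper name `confusion_metrics` are hypothetical, not results from the dataset in these notes.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive common metrics from the four cells of a confusion matrix."""
    total = tp + fp + tn + fn
    predicted_positive = tp + fp
    actual_positive = tp + fn
    return {
        # Share of predicted positives that are truly positive.
        "ppv": tp / predicted_positive if predicted_positive else float("nan"),
        # Share of actual positives that the model finds.
        "recall": tp / actual_positive if actual_positive else float("nan"),
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Hypothetical counts (assumed values, for illustration only).
metrics = confusion_metrics(tp=40, fp=60, tn=800, fn=100)
print(metrics)
```

With these made-up counts the model is right on 84% of all cases, yet only 40% of its positive predictions are correct, which is why the two numbers should be reported separately.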
areas where model struggles
We are arguing for a pilot project because, throughout the discussion of this dataset, we have noted that, quite simply, it is not a very predictive dataset.
We have come to trust DataRobot’s ability to find predictive value, and none of our tests, including use of cross validation and holdout samples, suggest any problems with the algorithms.
We discussed two main types of data:
Internal data. In this case, we could argue that more data on past visits should be collected for patients that are frequent visitors
External data. While it is not clear that mining public external data will be a worthwhile exercise for this project, it is likely that this project could benefit from purchased data such as grocery store data, financial data, distance to hospital, and so on.
most predictive features
Given that our focus is on predicting patient readmissions, discussion will focus on the red bars.
Be ready to develop a story around these findings based on the extensive examination we have done of these features.
not all features are created equally
Often information that helps a model make high-quality decisions is not helpful to change practice
There are at least four kinds of features to consider before going into a management presentation:
features that need to be changed and therefore require a re-run of the models
features requiring further examination
recommended business actions
the final part of the presentation should contain explicit recommendations for the next step.