Chapters 21, 22, 23
Setting up a prediction system simply means that at the decision point where data is available
Sometimes you need to retrain your model. Sometimes your best model won't actually prove to be the best when you run it against the holdout data, because everything changes subtly over time.
Once we have looked at everything, we can try to improve predictive performance by using 100% of our data (something we haven't done yet).
21.2 Choose deployment strategy
DataRobot has many different deployment strategies; we need to decide which one is best.
DataRobot Prime
Creates an approximation of your selected model, available as code in the Python and Java programming languages. Its availability to you is based on your DataRobot account type.
Batch
uses the DataRobot API to upload and score multiple large files in parallel. The code required to use batch scoring is available at
Application Programming Interface
An Application Programming Interface (API) is created on the DataRobot server allowing a programmer to write a program that uploads data for a new patient to this API and receive back a probability that the patient will be readmitted.
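As a rough sketch of what such a client program might look like, the following Python builds a scoring request for one patient and reads a probability out of the reply. The URL, the token, the patient field names, and the response shape are all placeholders assumed for illustration, not the actual DataRobot API; the real endpoint and payload format come from your own deployment.

```python
import json
import urllib.request

# Hypothetical endpoint and token; replace with the values for your deployment.
API_URL = "https://example.com/predictions"
API_TOKEN = "YOUR_TOKEN"


def build_request(patient: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one patient record."""
    body = json.dumps([patient]).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
        },
        method="POST",
    )


def readmission_probability(response_text: str) -> float:
    """Extract the probability from an ASSUMED response shape:
    {"data": [{"probability": <float>}]}."""
    payload = json.loads(response_text)
    return float(payload["data"][0]["probability"])
```

Sending the request would be a call to `urllib.request.urlopen(build_request(patient))`, with the body of the reply passed to `readmission_probability`. Keeping request construction and response parsing in separate functions makes both easy to check without a live server.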
In place with Spark
Drag and drop
A slow approach to scoring: you just drag a file in.
Chapter 22) Documenting Modeling Process for Reproducibility
Proper documentation helps others understand what was done to achieve the results and why the project exists.
Documenting the modeling process is where projects most often fail.
Consider model documentation your opportunity to do more of the work you love
by attending to the details of articulating the process while a project is fresh in your
mind.
Making careful notes about the business purpose served by the model will save time in the future.
You want to document the exact queries used to access the data as well as the exact code used to transform the data into a machine learning accessible dataset.
Finally, record the business rules for use of the model and the probability thresholds
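A minimal sketch of how these items could be captured in one reproducible record. The field names, the example query, the transformation snippet, the business rule, and the 0.30 threshold are illustrative assumptions, not values from the book or from DataRobot.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass
class ModelDocumentation:
    business_purpose: str
    data_queries: list          # exact queries used to access the data
    transform_code_sha256: str  # fingerprint of the exact transformation code
    business_rules: list
    probability_threshold: float


def fingerprint(code_text: str) -> str:
    """Hash the transformation code so later changes are detectable."""
    return hashlib.sha256(code_text.encode("utf-8")).hexdigest()


def flag_patient(probability: float, doc: ModelDocumentation) -> bool:
    """Apply the documented threshold to a predicted probability."""
    return probability >= doc.probability_threshold


transform_code = "df['age_bucket'] = df['age'] // 10"  # placeholder snippet
doc = ModelDocumentation(
    business_purpose="Identify patients at risk of readmission",
    data_queries=["SELECT * FROM admissions WHERE discharge_year = 2019"],  # hypothetical
    transform_code_sha256=fingerprint(transform_code),
    business_rules=["Patients at or above the threshold get a follow-up call"],  # hypothetical
    probability_threshold=0.30,  # assumed value
)

# Serialize the record so it can be stored alongside the model.
record = json.dumps(asdict(doc), indent=2)
```

Storing a hash next to the code itself is a design choice: it lets a later reader confirm that the transformation script they are looking at is the one that produced the training dataset.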
Chapter 23) Creating a model monitoring and maintenance plan
We need to create a monitoring and maintenance plan that tells others what to do when changes in the environment hurt the effectiveness of the model.
DataRobot will fail rather than attempt to make the best of
the available data.
To avoid the model failure, it is sometimes a good idea to rerun the model as soon
as sufficient new data is available
One approach to detecting declining performance is to evaluate the training data against new data.
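One way to make this comparison concrete is the population stability index (PSI), a common drift measure that is not named in these notes and is swapped in here purely as an illustration. The sketch bins a numeric feature using the range of the training data and compares the share of training and new values in each bin; the bin count and the small `eps` floor are assumed values.

```python
import math


def psi(train: list, new: list, bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index of `new` relative to `train` for one numeric feature."""
    lo, hi = min(train), max(train)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all training values are equal

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the training range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = shares(train), shares(new)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

A value of zero means the two samples fall into the bins in the same proportions, and a larger value means the new data looks less like the training data. This measures a shift in the inputs rather than model accuracy directly, so it works best as an early warning that prompts a closer look.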