Chapters 21-23 (21 (Choose Deployment Strategy (Batch (uses the DataRobot…
Chapters 21-23
21
Pick the top model, increase the sample to 100%, and rerun the model on the new sample.
22
model documentation
Work under the assumption that the project will need to be revisited within a year, and think through what information would be helpful to have immediate access to at that time.
Where did the data come from? How was the data processed? What parameters or selections were used when creating and selecting the model? How is the model used within the business?
Example steps to document: using LogLoss as the optimization criterion, using the autopilot to build the models, selecting the best model based on cross-validation, and running that model with 100% of the data. This process has been simplified to illustrate the important point that every step leading to model implementation needs to be detailed.
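One way to keep these answers at hand is a small structured record stored next to the model. The sketch below is only an illustration: the class name `ModelRecord`, its field names, and the placeholder strings are assumptions, not part of the chapters. The metric, build method, selection basis, and sample size come from the steps listed above.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ModelRecord:
    """Hypothetical documentation record; field names are illustrative."""
    data_source: str            # where did the data come from?
    processing_steps: list      # how was the data processed?
    optimization_metric: str    # criterion used to rank models
    build_method: str           # how the candidate models were built
    selection_basis: str        # how the final model was chosen
    training_sample_pct: int    # share of the data used for the final run
    business_use: str           # how the model is used within the business


record = ModelRecord(
    data_source="<fill in: source system or file>",        # placeholder
    processing_steps=["<fill in: each processing step>"],  # placeholder
    optimization_metric="LogLoss",
    build_method="autopilot",
    selection_basis="cross validation",
    training_sample_pct=100,
    business_use="<fill in: business process>",            # placeholder
)

# Serialize so the record can be versioned alongside the model.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```

A plain JSON file is used here so that the record can still be read a year later without any particular tool.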
23
potential problems
Related to documentation of the newly installed system, it is also necessary to create a monitoring and maintenance plan. This serves to inform others what to do in the event of changes in the environment that stand to impact the effectiveness of the model.
To avoid model failure, it is a good idea to rerun the model as soon as sufficient new data is available.
One approach to detecting declining performance is to evaluate the training data against new data.
source of data as target
The methodology here is to create a new target that specifies whether a case was used to create the original model or whether that case was retrieved from the production system after the model was used for prediction.
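A minimal sketch of the source-as-target idea, in plain Python. The helper names (`label_by_source`, `best_stump_accuracy`), the column name `from_production`, and the toy `age` values are all assumptions made for illustration. A one-feature threshold rule stands in for a real model, and it is scored on the same rows it is tuned on, which a real check would avoid by using a holdout set. If the rule can tell the two sources apart much better than chance, the production data has likely shifted away from the training data.

```python
def label_by_source(training_rows, production_rows):
    """Combine both sets and add the new target:
    0 = case used to build the original model, 1 = case from production."""
    combined = [dict(row, from_production=0) for row in training_rows]
    combined += [dict(row, from_production=1) for row in production_rows]
    return combined


def best_stump_accuracy(rows, feature, target="from_production"):
    """Best accuracy of a one-feature threshold rule predicting the source."""
    best = 0.0
    for threshold in sorted({row[feature] for row in rows}):
        # Rule: predict "production" when the feature exceeds the threshold.
        correct = sum((row[feature] > threshold) == bool(row[target]) for row in rows)
        accuracy = correct / len(rows)
        # The inverted rule has accuracy 1 - accuracy, so consider both.
        best = max(best, accuracy, 1 - accuracy)
    return best


# Toy data (hypothetical): production cases are clearly older than training cases.
training = [{"age": a} for a in (20, 25, 30, 35)]
production = [{"age": a} for a in (40, 45, 50, 55)]

combined = label_by_source(training, production)
drift_accuracy = best_stump_accuracy(combined, "age")
print(drift_accuracy)
```

With balanced sources, an accuracy near 0.5 suggests the sources are hard to tell apart, while a value near 1.0 suggests the environment has changed.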
It is recommended to use either the same measure used for model selection or the Matthews Correlation Coefficient. It is also recommended to automate threshold testing and to set up messages and alerts that signal when the determined threshold is reached.
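The Matthews Correlation Coefficient and an automated threshold test can be sketched as follows. This is an illustration under stated assumptions: the function names, the threshold value of 0.3, and the `notify` callback are hypothetical, and the alert is assumed to fire when the score rises to the threshold (as it would when scoring a source-of-data classifier). For a measure where lower values signal trouble, the comparison would be reversed.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels (0/1), computed from the confusion matrix."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def check_threshold(score, threshold, notify=print):
    """Send an alert through `notify` when the score reaches the threshold."""
    if score >= threshold:
        notify(f"ALERT: score {score:.2f} reached threshold {threshold:.2f}")
        return True
    return False


# Hypothetical predictions from a source-of-data classifier.
actual_source = [1, 1, 0, 0]
predicted_source = [1, 1, 0, 0]

alerts = []  # stands in for an email or messaging hook
mcc = matthews_corrcoef(actual_source, predicted_source)
triggered = check_threshold(mcc, threshold=0.3, notify=alerts.append)
print(mcc, triggered)
```

Passing the notification function as a parameter keeps the threshold test separate from the delivery channel, so the same check can run on a schedule and route to whatever alerting system is in place.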