Larsen Chapter 15
Advanced Options
The only difference between Random and Stratified is that the Stratified option works a bit harder to maintain the same distribution of target values in the holdout as in the other samples (approximately the same percentage of True values in each sample).
Once the lock-box has been filled with holdout cases and appropriately locked, the remaining cases are split into n folds. The number of folds (n) can also be set manually.
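The stratified holdout followed by the split into n folds can be sketched in plain Python. This is only an illustration of the idea, not DataRobot's implementation; the function name `stratified_holdout_and_folds`, the 20% holdout fraction, and the round-robin fold assignment are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_holdout_and_folds(targets, holdout_frac=0.2, n_folds=5, seed=0):
    """Hypothetical helper: return (holdout_indices, folds)."""
    rng = random.Random(seed)

    # Group case indices by target value so each class is split separately.
    by_class = defaultdict(list)
    for i, y in enumerate(targets):
        by_class[y].append(i)

    holdout = []
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Fill the "lock-box" with the same fraction of every class.
        n_hold = round(len(indices) * holdout_frac)
        holdout.extend(indices[:n_hold])
        # Deal the remaining cases of this class round-robin across the folds.
        for k, idx in enumerate(indices[n_hold:]):
            folds[k % n_folds].append(idx)
    return sorted(holdout), [sorted(f) for f in folds]

# 100 cases, 20% of which have a True target.
targets = [True] * 20 + [False] * 80
holdout, folds = stratified_holdout_and_folds(targets)
```

Because each class is split on its own, the holdout and every fold end up with roughly the same share of True values as the full dataset.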
Partition feature
The allocation to train, validation, and holdout samples is specified manually through a selected feature. Three different values in the selected feature are required. If more than three unique values exist in the selected feature, only the train and validation folds can be selected, and all other values are assigned to holdout.
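The assignment rule described above can be written as a small function. This is a sketch under assumptions: `assign_partitions` and its parameters are hypothetical names, not part of any DataRobot API, and the error raised for fewer than three values is one possible reading of "three different values are required".

```python
def assign_partitions(values, train_value, validation_value):
    """Hypothetical helper: map each case's feature value to a partition."""
    if len(set(values)) < 3:
        raise ValueError("the partition feature needs at least three unique values")

    partitions = []
    for v in values:
        if v == train_value:
            partitions.append("train")
        elif v == validation_value:
            partitions.append("validation")
        else:
            # With exactly three unique values this is the one remaining value;
            # with more than three, every other value falls into holdout.
            partitions.append("holdout")
    return partitions

# Four unique values: "c" and "d" both end up in holdout.
values = ["a", "b", "c", "d", "a"]
result = assign_partitions(values, train_value="a", validation_value="b")
```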
Group Approach
allows a group membership feature to be specified; DataRobot decides where each case is partitioned but always keeps each group together in a single partition
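A minimal sketch of group-based partitioning, assuming a simple scheme in which whole groups are shuffled and then dealt across folds. The function `group_folds` and the example group labels are hypothetical; how DataRobot actually balances groups is not stated in these notes.

```python
import random

def group_folds(groups, n_folds=3, seed=0):
    """Hypothetical helper: return a fold number per case, one fold per group."""
    unique_groups = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(unique_groups)
    # Assign whole groups, not individual cases, to folds.
    fold_of_group = {g: k % n_folds for k, g in enumerate(unique_groups)}
    return [fold_of_group[g] for g in groups]

# Example: several cases per patient; a patient never spans two folds.
groups = ["p1", "p2", "p1", "p3", "p2", "p4", "p4", "p5"]
fold_ids = group_folds(groups)
```

Keeping a group in one partition prevents cases from the same group from appearing on both the training and the evaluation side.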
Date/Time
makes sure that all validation cases occur in a time period after the time of the cases used to create models
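The date/time rule can be illustrated with a cutoff-based split. This is a sketch under assumptions: `time_split` and the 25% validation fraction are invented for the example, and ties on the cutoff date are sent to validation so that no validation case is earlier than any training case.

```python
from datetime import date

def time_split(dates, validation_frac=0.25):
    """Hypothetical helper: return (train_indices, validation_indices)."""
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    n_val = max(1, round(len(order) * validation_frac))
    # Every case dated on or after the cutoff goes to validation.
    cutoff = dates[order[-n_val]]
    train = [i for i in order if dates[i] < cutoff]
    validation = [i for i in order if dates[i] >= cutoff]
    return train, validation

dates = [date(2020, 1, d) for d in (5, 1, 9, 3, 7, 2, 8, 4)]
train, validation = time_split(dates)
```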
Candidate Models
improve understanding of what combinations of data, preprocessing, parameters, and algorithms work well when constructing models
Model Selection Process
even though each different type of algorithm has very different run-times and processing needs, each is assigned here to a “worker.”
The sample is randomly selected from each of the folds used to train each model, and the full set of the final fold’s cases is used to evaluate the models
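One way to read the sampling step is sketched below: a random fraction is drawn from every training fold, while the last fold is kept whole for evaluation. The function `sample_training_rows`, the fold layout, and the 50% sample fraction are assumptions made for illustration only.

```python
import random

def sample_training_rows(folds, sample_frac, seed=0):
    """Hypothetical helper: return (training_sample, validation_rows)."""
    rng = random.Random(seed)
    # Treat the last fold as the evaluation fold; the rest are for training.
    *training_folds, validation_fold = folds
    sample = []
    for fold in training_folds:
        k = round(len(fold) * sample_frac)
        # Draw the same fraction, without replacement, from each training fold.
        sample.extend(rng.sample(fold, k))
    return sample, list(validation_fold)

folds = [list(range(0, 20)), list(range(20, 40)), list(range(40, 60))]
train_sample, validation_rows = sample_training_rows(folds, sample_frac=0.5)
```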