Chapter 15. Build Candidate Models
15.1 Starting the Process
The desired target feature can be found directly in the feature list; hover over it, and then click “Use as Target.”
DataRobot automatically deals with it by downsampling (randomly removing cases) the majority class (that which is most common).
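The downsampling idea can be sketched as follows. This is a minimal illustration with made-up labels, not DataRobot's internal code: majority-class cases are randomly dropped until the classes are balanced.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical imbalanced labels: 90 False (majority), 10 True (minority).
y = np.array([False] * 90 + [True] * 10)
idx = np.arange(len(y))

# Randomly remove majority-class cases until both classes are the same size.
minority_n = int((y == True).sum())
majority_idx = idx[y == False]
keep_majority = rng.choice(majority_idx, size=minority_n, replace=False)
balanced_idx = np.concatenate([keep_majority, idx[y == True]])

print((y[balanced_idx] == False).sum(), (y[balanced_idx] == True).sum())  # 10 10
```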
LogLoss (Accuracy) means that rather than evaluating the model directly on whether it assigns cases (rows) to the correct “label” (False or True), the model is evaluated based on the probabilities it generates and how far those probabilities are from the correct answer.
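The distinction can be seen with scikit-learn's `log_loss` (shown here as an illustration; the labels and probabilities are invented). Two models that assign every case to the correct label can still differ in LogLoss, because the measure rewards probabilities that sit closer to the truth:

```python
from sklearn.metrics import log_loss

# True labels and two models' predicted probabilities for the True class.
y_true = [0, 0, 1, 1]
confident_correct = [0.1, 0.2, 0.8, 0.9]  # probabilities close to the truth
hedging = [0.4, 0.4, 0.6, 0.6]            # same labels at a 0.5 cutoff, less certain

# Both models classify every case correctly, yet LogLoss prefers the
# model whose probabilities are closer to the correct answers.
print(log_loss(y_true, confident_correct) < log_loss(y_true, hedging))  # True
```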
15.2 Advanced Options (not recommended for first time use)
Random is identical in its approach to the approach discussed in Appendix C: 20% of the data is assigned to the holdout sample (the red part of Figure 15.4). This number can be changed to another percentage.
There are occasions where it makes sense to increase the holdout sample to more than 20%, such as when you have a lot of data, and you want to ensure that the holdout evaluation is as accurate as possible.
The only difference between Random and Stratified is that the Stratified option works a bit harder to maintain the same distribution of target values inside the holdout as in the other samples.
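The stratified version of a 20% holdout can be sketched with scikit-learn's `train_test_split` (a stand-in for DataRobot's partitioning, with an invented 75/25 target distribution): passing `stratify=y` makes the holdout mirror the overall class balance.

```python
from sklearn.model_selection import train_test_split

# Hypothetical target with a 75/25 class split across 100 cases.
y = [0] * 75 + [1] * 25
X = list(range(100))

# Stratified 20% holdout: the holdout preserves the 75/25 distribution.
X_rest, X_hold, y_rest, y_hold = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)
print(len(y_hold), sum(y_hold))  # 20 holdout cases, 5 of them positive
```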
It is assumed that the allocation to train, validation, and holdout samples has been manually specified. To use all three, three different values in your selected feature are required. If more than three unique values exist in the selected feature, only the train and validation folds can be selected, and all other values will be assigned to holdout. This option is helpful when analyzing graph (network) data where it is important to ensure that there is no overlap between friendship networks (groups of people directly connected to each other) in the train and validation sets.
The Date/Time option ensures that models’ training and validation data are appropriately split by time, so that models are never evaluated on cases that occurred before the cases they were trained on.
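An out-of-time split of this kind can be sketched with pandas (the dates here are invented; this is an illustration of the idea, not DataRobot's implementation): sort by date, then cut so that every validation case occurs after every training case.

```python
import pandas as pd

# Hypothetical time-stamped cases, sorted by date.
df = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=10, freq="D"),
    "x": range(10),
}).sort_values("date")

# Out-of-time split: the most recent 20% of cases become validation,
# so no validation case precedes any training case.
cut = int(len(df) * 0.8)
train, valid = df.iloc[:cut], df.iloc[cut:]
print(train["date"].max() < valid["date"].min())  # True
```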
Once the lock-box has been filled with holdout cases and appropriately locked, the remaining cases are split into n folds. The (n)umber of folds can also be set manually. For small datasets, it may make sense to select more folds to leave the maximum number of cases in the training set. That said, be aware that selecting more folds comes with drawbacks. The validation sample (the first fold) will be less reliable if it contains a small set of cases. In addition, DataRobot makes very consequential decisions based on the first validation fold. If the intent is to run cross validation on the data, each additional fold adds one more full run of model creation.
Five and ten are the most commonly chosen numbers of folds, and since the DataRobot folks swear by five, keep the number of folds at 5 for this exercise.
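The mechanics of a five-fold split can be sketched with scikit-learn's `KFold` (standing in for DataRobot's partitioning, with an invented sample of 100 non-holdout cases): each fold serves once as the validation sample while the remaining four are used for training.

```python
from sklearn.model_selection import KFold

# 100 non-holdout cases split into 5 folds.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_sizes = [(len(train_idx), len(valid_idx))
              for train_idx, valid_idx in kf.split(list(range(100)))]
print(fold_sizes)  # each fold: 80 training cases, 20 validation cases
```

Note the trade-off described above: with ten folds each validation sample would shrink to 10 cases, and cross validation would require ten model runs instead of five.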
Moving now to the Partition Feature option, this is a method for determining exactly which cases are used in different folds. Partition Feature is different from the other approaches in that the user must do their own random or semi-random assignment of cases.
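A user-specified partition feature can be sketched with pandas (the column name and values here are invented for illustration): the user places the assignment in a column, and the partitions are read directly from it rather than being assigned randomly.

```python
import pandas as pd

# Hypothetical dataset where the user pre-assigned each case to a partition
# via a "partition" column with exactly three distinct values.
df = pd.DataFrame({
    "x": range(6),
    "partition": ["train", "train", "train", "validation", "holdout", "holdout"],
})

train = df[df["partition"] == "train"]
valid = df[df["partition"] == "validation"]
holdout = df[df["partition"] == "holdout"]
print(len(train), len(valid), len(holdout))  # 3 1 2
```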
The Group approach accomplishes much the same as the Partition Feature, but with some key differences: first, it allows for the specification of a group-membership feature. Second, DataRobot makes decisions about where a case is to be partitioned but always keeps each group (cases with the same value in the selected feature) together in only one partition. Finally, the Date/Time option deals with a critical evaluation issue: making sure that all validation cases occur in a time period after that of the cases used to create models.
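The group-based behavior can be sketched with scikit-learn's `GroupKFold` (a stand-in for DataRobot's Group option, with invented group IDs playing the role of friendship networks): cases sharing a group ID never appear in both a training fold and its validation fold.

```python
from sklearn.model_selection import GroupKFold

# Hypothetical friendship-network IDs: each pair of cases forms one group.
groups = [1, 1, 2, 2, 3, 3]
X = list(range(6))

gkf = GroupKFold(n_splits=3)
overlaps = []
for train_idx, valid_idx in gkf.split(X, groups=groups):
    train_groups = {groups[i] for i in train_idx}
    valid_groups = {groups[i] for i in valid_idx}
    overlaps.append(train_groups & valid_groups)

print(overlaps)  # no group ever spans train and validation: [set(), set(), set()]
```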
15.3 Starting the Analytical Process
Autopilot and Quick are similar, except that for Autopilot DataRobot starts the analysis at 16% of the sample and uses that information to determine which models to run with 32% of the sample. Quick starts right at 32% with models that have historically performed well. One other difference is that in Quick, only four models are automatically cross validated and only one blender algorithm applied.
For the dataset we are using, the top models in Autopilot vs. Quick turn out to have near identical performance characteristics on the LogLoss measure.
As a short introduction, just know that Informative Features represents all the data features except for the ones automatically excluded and tagged, for example, as [Duplicate] or [Too Few Values].
Autopilot will implement the standard DataRobot process and likely lead to the best possible results.
Step 1, “Setting target feature,” transfers the user’s target feature decision into the analysis system.
Step 2, “Creating CV and Holdout partitions,” uses the decisions we made in the advanced settings (or the default if we did nothing) to randomly or semi-randomly assign cases to the holdout and various cross validation folds.
Step 3, “Characterizing target variable,” is where DataRobot will save the distribution of the target to the analysis system for later use in decisions about which models to run.
Step 4, “Loading dataset and preparing data,” is relevant only if the dataset is large (that is, over 500MB). In that case, all the initial evaluations before this step will have been conducted on a 500MB sample of the dataset, and the rest of the dataset is now loaded.
Step 5, “Saving target and partitioning information,” is where the actual partition assignments are stored: the cross validation folds and the holdout set are saved in separate files on disk.
In Step 6, importance scores are calculated (discussed in the next paragraph). The features have now been sorted by their importance in individually predicting the target.
Step 7, “Calculating list of models,” is where information from steps 3–6 is used to determine which blueprints to run in the autopilot process.
As soon as DataRobot completes these seven steps, a new column will be added to the feature list: Importance, providing first evidence of the predictive value of specific features.