Understanding the Process
Learning curves and Speed
two forms of additional data
Additional features and additional cases
apply the rule of diminishing marginal improvement: each extra batch of data improves the model less than the batch before it
under the leaderboard tab is "Learning Curves"
this plots validation scores against the % of data used
lower scores are better because the score shown is log loss, so a lower value means less prediction error
log loss curves look like an arm with an elbow
make sure to take numbers and apply them to reality and understand their implications
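A minimal sketch of the two ideas above: how log loss is computed (lower is better) and how to read diminishing marginal improvement off a learning curve. The function names and the curve points are hypothetical, made up for illustration. They are not DataRobot output or DataRobot's API.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss over all cases; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def marginal_improvements(curve):
    """Drop in log loss per added percentage point of data,
    for each pair of consecutive (pct_of_data, log_loss) points."""
    gains = []
    for (pct_a, loss_a), (pct_b, loss_b) in zip(curve, curve[1:]):
        gains.append((loss_a - loss_b) / (pct_b - pct_a))
    return gains

# Confident, correct predictions score lower (better) than hesitant ones.
confident = log_loss([1, 0], [0.9, 0.1])
hesitant = log_loss([1, 0], [0.6, 0.4])

# Hypothetical learning-curve points: (% of data used, validation log loss).
curve = [(16, 0.62), (32, 0.55), (64, 0.52)]
gains = marginal_improvements(curve)

print(f"confident={confident:.4f} hesitant={hesitant:.4f}")
print("improvement per added % of data:", gains)
```

The shrinking values in `gains` are the "elbow": past it, more data buys little, so weigh the cost of collecting more cases against the small gain.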
Accuracy Tradeoffs
speed vs accuracy tab shows how fast the model evaluates
efficient frontier line is drawn between the dots closest to the axes
usually occurs when the criteria are unrelated
models must be able to produce rapid predictions when cases are added
start by calculating the speed of the slowest model
compare results with the reality of the project at hand
ignore speed if it isn't important to your model
if time is important then the efficient frontier line will be helpful
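A rough sketch of how the efficient frontier can be found, assuming each model is described by a prediction time and a log loss, with lower being better for both. The model names, timings, and scores are hypothetical. This is one common way to compute such a frontier (keep every model that no other model beats on both criteria), not necessarily how DataRobot draws its line.

```python
def efficient_frontier(models):
    """Return the models that no other model beats on both criteria.

    models: list of (name, prediction_time_ms, log_loss),
    where lower is better for both numbers.
    """
    frontier = []
    for name, time_ms, loss in models:
        dominated = any(
            other_time <= time_ms and other_loss <= loss
            and (other_time < time_ms or other_loss < loss)
            for _, other_time, other_loss in models
        )
        if not dominated:
            frontier.append((name, time_ms, loss))
    return sorted(frontier, key=lambda m: m[1])  # fastest first

# Hypothetical models: (name, time to score a fixed batch in ms, log loss).
models = [
    ("A", 5, 0.60),
    ("B", 20, 0.52),
    ("C", 25, 0.58),  # slower AND less accurate than B
    ("D", 80, 0.50),
]

frontier = efficient_frontier(models)
slowest = max(models, key=lambda m: m[1])

print("frontier:", [name for name, _, _ in frontier])
print("slowest model:", slowest[0], f"({slowest[1]} ms)")
```

If even the slowest model is fast enough for the project, speed can be ignored and the most accurate model chosen. Otherwise, pick from the frontier.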
Blueprints
click "blueprint" to see them
shows the inner workings of algorithms
DataRobot will sometimes encode categorical features with one-hot encoding
DataRobot will also impute values where data is missing
clicking the links shows information on DataRobot's code
very helpful tool!!
if the data shows true / false values then the algorithm can look for predictions
the standardize box shows numeric features that are standardized after getting imputed
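To make the blueprint steps concrete, here is a small sketch of one-hot encoding, imputation, and standardization in plain Python. The helper names are hypothetical, and median imputation is an assumption chosen for illustration. The notes above do not say which imputation method a given blueprint uses.

```python
import statistics

def one_hot(values):
    """Turn one categorical column into true/false columns, one per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": v == c for c in categories} for v in values]

def impute_median(values):
    """Replace missing values (None) with the median of the observed ones.
    Median is an assumption here; other strategies exist."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def standardize(values):
    """Rescale to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / sd for v in values]

# Hypothetical columns.
colors = ["red", "blue", "red"]
ages = [20, None, 40, 30]

encoded = one_hot(colors)
imputed = impute_median(ages)   # imputation happens first...
scaled = standardize(imputed)   # ...then standardization, as in the blueprint

print(encoded[0])
print(imputed)
print([round(v, 3) for v in scaled])
```

The order matters: the numeric column is imputed before it is standardized, so the mean and standard deviation are computed on a column with no gaps.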