Ch 16: Understanding the Process
Learning curves & speed
two forms of additional data
additional features
additional cases
the greater the amount of relevant data at the outset of a project, the less likely additional data will improve predictability
Learning Curves screen
validation scores on the Y-axis
on the Y-axis, lower scores are preferable
LogLoss is a ‘loss’ measure (every mistake in prediction increases the ‘loss score’)
percent of the available data used on the X-axis
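A minimal sketch of why lower LogLoss is better, assuming the standard binary log loss (cross-entropy) definition. The function name log_loss, the clipping constant eps, and the sample labels and probabilities are illustrative assumptions, not values from the chapter.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; every poor prediction adds to the total."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so log(0) is never evaluated (eps is an assumption).
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical labels and predicted probabilities for illustration only.
labels = [1, 0, 1, 0]
good = log_loss(labels, [0.9, 0.1, 0.8, 0.2])  # all cases predicted well
bad = log_loss(labels, [0.9, 0.1, 0.2, 0.2])   # third case predicted poorly
```

Here bad comes out larger than good, which is the behavior the Learning Curves screen relies on: a curve that falls as more data is used indicates improving predictions.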
Accuracy tradeoffs
Speed vs. Accuracy tab
speed: how rapidly the model will evaluate new cases after being put into production
Imputation
impute missing values
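A minimal sketch of imputation, assuming missing values are marked as None and filled with the median of the observed values. The chapter does not say which fill value is used, so the median, the function name impute_median, and the sample data are all assumptions.

```python
from statistics import median

def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical numeric feature with two missing entries.
ages = [34, None, 29, 41, None]
imputed = impute_median(ages)
```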
Standardization
ensures that all numeric features are standardized
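A minimal sketch of standardization, assuming the common z-score form (subtract the mean, divide by the standard deviation). The chapter does not give the exact formula, so this form, the function name standardize, and the sample data are assumptions.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical numeric feature on a large scale.
incomes = [30000, 45000, 60000, 75000]
scaled = standardize(incomes)
```

After this step, features measured in very different units sit on a comparable scale.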
One-hot encoding
for any categorical feature that fulfills certain requirements, a new feature is created for every category that exists within the original feature
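A minimal sketch of one-hot encoding. The requirements a feature must fulfill are not stated here, so this sketch encodes the feature unconditionally; the function name one_hot, the prefix naming scheme, and the sample data are assumptions.

```python
def one_hot(values, prefix):
    """Create one 0/1 feature per category found in the original feature."""
    categories = sorted(set(values))
    return [
        {f"{prefix}_{c}": int(v == c) for c in categories}
        for v in values
    ]

# Hypothetical categorical feature with three categories.
colors = ["red", "blue", "red", "green"]
encoded = one_hot(colors, "color")
```

Each row ends up with exactly one new feature set to 1, the one matching its original category.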
James Frainey
Jafr4672@colorado.edu