Ch. 16 Understanding The Process
learning curves and speed
additional features
additional cases
rule of diminishing marginal improvement
the greater the amount of relevant data at the outset of a project, the less likely it is that additional data will improve predictability
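A minimal sketch of the diminishing-improvement rule, assuming a learning curve is recorded as (share of data, validation error) pairs. The numbers and the `marginal_improvements` helper are made up for illustration and do not come from any real project.

```python
# Hypothetical learning curve: (share of training data used, validation error).
# The values are invented for illustration only.
curve = [(0.16, 0.40), (0.32, 0.33), (0.64, 0.30)]


def marginal_improvements(points):
    """Return the error reduction gained at each step up in data size."""
    return [
        round(prev_err - err, 4)
        for (_, prev_err), (_, err) in zip(points, points[1:])
    ]


gains = marginal_improvements(curve)
print(gains)
```

In this made-up curve each doubling of the data buys a smaller error reduction than the previous one, which is the pattern the rule describes.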
accuracy tradeoffs
speed vs accuracy
speed of slowest model
compare with the actual predictive needs of the given project
always look for the efficient frontier line
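One way to read the efficient frontier in a speed vs accuracy comparison: keep only the models that no other model beats on both speed and accuracy. The sketch below assumes each model is described by a prediction time and a validation error; the model names, the numbers, and the `efficient_frontier` helper are hypothetical.

```python
# Hypothetical models: (name, prediction time in ms, validation error).
# All values are invented for illustration.
models = [
    ("A", 5.0, 0.30),
    ("B", 20.0, 0.22),
    ("C", 25.0, 0.25),   # slower and less accurate than B
    ("D", 120.0, 0.20),
    ("E", 150.0, 0.21),  # slower and less accurate than D
]


def efficient_frontier(candidates):
    """Keep models that are not dominated on both speed and error."""
    frontier = []
    for name, time_ms, error in candidates:
        dominated = any(
            t <= time_ms and e <= error and (t < time_ms or e < error)
            for _, t, e in candidates
        )
        if not dominated:
            frontier.append((name, time_ms, error))
    # Sort from fastest to slowest so the frontier reads left to right.
    return sorted(frontier, key=lambda m: m[1])


frontier = efficient_frontier(models)
print([name for name, _, _ in frontier])
```

Which point on the frontier to pick still depends on how fast predictions actually need to be for the project.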
blueprints
missing values are imputed, for example by replacing them with an average for the column
"one hot encode"
for any categorical feature that fulfills the requirements, a new feature is created for every category that exists within the original feature
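A minimal sketch of one-hot encoding in plain Python, assuming a single categorical column held as a list. The `one_hot_encode` helper and the `is_` column prefix are illustrative choices, not the tool's own naming.

```python
def one_hot_encode(values):
    """Create one 0/1 feature per category found in the original feature."""
    categories = sorted(set(values))
    return {
        f"is_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


colors = ["red", "green", "red", "blue"]
encoded = one_hot_encode(colors)
for column, flags in encoded.items():
    print(column, flags)
```

Each row ends up with exactly one 1 across the new columns, marking which category it had.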
large negative number to all values
imputation
uses median value
missing values placeholder
"indicator"
standardization
each feature is scaled
mean set to 0 and standard deviation set to 1
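A sketch of standardization for one numeric feature: subtract the mean, then divide by the standard deviation. It assumes the population standard deviation is used and that the column is not constant (a constant column would divide by zero); the `standardize` helper is illustrative.

```python
from statistics import mean, pstdev


def standardize(column):
    """Scale a feature so its mean is 0 and its standard deviation is 1."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (an assumption here)
    return [(v - mu) / sigma for v in column]


scaled = standardize([2.0, 4.0, 6.0, 8.0])
print(scaled)
```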
hyperparameter optimization
advanced tuning
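One simple form of hyperparameter optimization is a grid search: try every combination of candidate settings and keep the best-scoring one. The sketch below is an assumption about how that could look, not the tool's actual tuning procedure; `validation_error` is a made-up stand-in for training and scoring a model, and the parameter names are hypothetical.

```python
from itertools import product


def validation_error(learning_rate, depth):
    """Stand-in for training and scoring a model (made-up formula)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 4) ** 2


# Hypothetical candidate values for two hyperparameters.
grid = {"learning_rate": [0.01, 0.1, 0.5], "depth": [2, 4, 8]}

# Evaluate every combination and keep the one with the lowest error.
best_params = min(
    (dict(zip(grid, combo)) for combo in product(*grid.values())),
    key=lambda params: validation_error(**params),
)
print(best_params)
```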