Ch.16 (blueprint (the model blueprints addressed at the start of this…
Ch.16
blueprint
-
imputation
Go to the highest-ranked Regularized Logistic Regression model (model M56) and click on its name to open the blueprint page.
This model blueprint contains more information. In this case, categorical features are one-hot encoded, and numerical features have their missing values imputed.
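As a rough illustration of what imputing missing values means, here is a minimal sketch. It assumes median imputation, which is only one common choice; the notes above do not say which strategy the blueprint uses. The function name `impute_median` and the sample data are hypothetical.

```python
from statistics import median

def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries.

    Assumption: median imputation. The blueprint's actual strategy is not
    stated in these notes.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical numeric feature with one missing value
lab_procedures = [1, None, 3, 10]
imputed = impute_median(lab_procedures)
```

After this step the feature has no gaps, so later steps such as standardization can compute a mean and standard deviation over every row.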
-
standardization
Now, click on the Standardize box to see that, after missing values are imputed, the numeric features are all standardized.
It becomes clear that some algorithms, such as Support Vector Machines and some linear models, will struggle with features that have different standard deviations.
Each feature is therefore “scaled,” which means that its values are shifted so the mean is zero and rescaled to “unit variance,” which is a fancy way to say that the standard deviation is 1.
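The scaling described above can be sketched in a few lines. This is an illustrative version using only the Python standard library, not the tool's own implementation; the function name `standardize` and the sample ages are hypothetical, and it assumes the population standard deviation is used.

```python
from statistics import mean, pstdev

def standardize(values):
    """Shift a feature to mean 0 and rescale it to standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A constant feature carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical numeric feature
ages = [20, 30, 40, 50, 60]
scaled = standardize(ages)
```

Each scaled value now reads as "how many standard deviations above or below the mean," so features measured in very different units end up on a comparable scale.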
one-hot encoding
For any categorical feature that fulfills certain requirements, a new feature is created for every category that exists within the original feature.
Each new feature takes one of two values: "yes" or "no" (true or false).
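A small sketch of one-hot encoding follows. It is illustrative only: the function name `one_hot`, the `name=category` column naming, and the sample feature are hypothetical, and it encodes every category without checking the "certain requirements" mentioned above, since these notes do not list them.

```python
def one_hot(name, values):
    """Create one 0/1 feature per category found in a categorical feature."""
    categories = sorted(set(values))
    return [
        {f"{name}={c}": int(v == c) for c in categories}
        for v in values
    ]

# Hypothetical categorical feature with two categories
rows = one_hot("color", ["red", "blue", "red"])
# rows[0] is {"color=blue": 0, "color=red": 1}
```

In every row exactly one of the new features is 1 (yes/true) and the rest are 0 (no/false).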