Feature Engineering Techniques
Imputation
Types
Numerical
e.g. a column containing only '1' and N/A values.
The N/A entries may actually be meant as '0'.
Categorical
Replace missing values with the most frequent category,
or with "Other" if the categories are uniformly distributed
Assigning relevant values to missing data
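A minimal sketch of both imputation types, using only the Python standard library. The function names and sample columns are hypothetical, missing values are represented as `None`, and "uniformly distributed" is approximated here as a tie for the most frequent category, which is an assumption rather than a fixed rule.

```python
from collections import Counter

def impute_numerical(values, fill=0):
    """Replace missing entries (None) with a fixed value, 0 by default."""
    return [fill if v is None else v for v in values]

def impute_categorical(values, fallback="Other"):
    """Fill missing entries with the most frequent category.

    If no single category is most frequent (empty column or a tie for
    first place), use the fallback label instead.
    """
    counts = Counter(v for v in values if v is not None)
    ranked = counts.most_common(2)
    tie = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
    fill = fallback if (not ranked or tie) else ranked[0][0]
    return [fill if v is None else v for v in values]

purchased = impute_numerical([1, None, 1, None])
colour = impute_categorical(["red", "red", None, "blue"])
size = impute_categorical(["S", "M", None, "L"])
```

Here `purchased` has its gaps filled with 0, `colour` has its gap filled with "red", and `size` falls back to "Other" because no category dominates.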
Handling Outliers
Data point that differs significantly from other observations (e.g. more than 3 standard
deviations from the mean)
Standard Deviation
Percentiles
Drop or cap?
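A minimal sketch of standard-deviation detection and of the two handling strategies, using only the Python standard library. The function names, the sample data, and the 5th/95th percentile caps are illustrative assumptions; the 3-standard-deviation default follows the definition above.

```python
import statistics

def find_outliers(values, threshold=3.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [v for v in values if abs(v - mean) > threshold * std]

def drop_outliers(values, threshold=3.0):
    """Drop strategy: keep only values within the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) <= threshold * std]

def cap_outliers(values, lower_pct=5, upper_pct=95):
    """Cap strategy: clamp values to the given lower and upper percentiles."""
    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]

readings = [10] * 19 + [100]       # one extreme reading among 20
flagged = find_outliers(readings)  # values beyond 3 standard deviations
kept = drop_outliers(readings)     # the extreme row is removed
capped = cap_outliers(readings)    # every row is kept, extremes are clamped
```

Dropping shrinks the dataset, while capping keeps every row but changes the extreme values, so the choice depends on how much data can be spared.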
Binning
Transform numerical values to categorical equivalents
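A minimal sketch of binning with the standard-library `bisect` module. The helper name, the bin edges (30 and 70), and the labels are hypothetical choices for illustration.

```python
from bisect import bisect_left

def to_bin(value, upper_edges, labels):
    """Map a number to the label of the bin it falls into.

    `upper_edges` holds the sorted, inclusive upper bounds of every bin
    except the last, so `labels` needs one more entry than `upper_edges`.
    """
    return labels[bisect_left(upper_edges, value)]

scores = [12, 30, 55, 70, 98]
levels = [to_bin(s, [30, 70], ["low", "mid", "high"]) for s in scores]
```

With these edges, values up to 30 become "low", values up to 70 become "mid", and everything above becomes "high".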
Log Transform
Transform highly skewed data to less skewed
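A minimal sketch of a log transform on a skewed column. The sample values are made up, and `log1p` (the log of 1 + x) is used here as one common choice because it also accepts zeros; it still assumes no value is -1 or lower.

```python
import math

incomes = [1_000, 2_000, 5_000, 10_000, 1_000_000]  # one very large value

# log1p(x) computes log(1 + x), which stays defined when x is 0.
log_incomes = [math.log1p(x) for x in incomes]

raw_ratio = max(incomes) / min(incomes)          # spread before the transform
log_ratio = max(log_incomes) / min(log_incomes)  # spread after the transform
```

The largest raw value is 1000 times the smallest, while after the transform the largest value is only about twice the smallest.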
Encoding
Categorical to numeric
Integer Encoding
One-Hot Encoding
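A minimal sketch of both encodings in plain Python. The function names and the sample column are hypothetical, and categories are ordered alphabetically only to make the output deterministic.

```python
def integer_encode(values):
    """Map each category to an integer code; return the codes and the mapping."""
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping

def one_hot_encode(values):
    """Map each category to a 0/1 vector with a single 1; return rows and column order."""
    categories = sorted(set(values))
    rows = [[1 if v == category else 0 for category in categories] for v in values]
    return rows, categories

colours = ["red", "green", "blue", "green"]
codes, mapping = integer_encode(colours)
one_hot, columns = one_hot_encode(colours)
```

Integer encoding is compact but implies an order between categories, while one-hot encoding avoids that implied order at the cost of one column per category.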
Grouping Operations
Grouping multiple repeated instances (rows) into one row
Key Factor: Identify the right aggregation
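A minimal sketch of grouping repeated rows into one row per key, with the aggregation passed in as a function. The helper name, the column names, and the sample rows are hypothetical.

```python
from collections import defaultdict

def group_rows(rows, key, value, aggregate):
    """Collapse rows sharing the same `key` into one aggregated value per key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {k: aggregate(v) for k, v in groups.items()}

orders = [
    {"customer": "a", "amount": 10.0},
    {"customer": "b", "amount": 5.0},
    {"customer": "a", "amount": 30.0},
]

total_spent = group_rows(orders, "customer", "amount", sum)
largest_order = group_rows(orders, "customer", "amount", max)
order_count = group_rows(orders, "customer", "amount", len)
```

The same rows yield different features depending on the aggregation chosen (sum, max, count), which is why picking the right one matters.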
Feature Split
Split data to be more meaningful for models
e.g. full name into "First", "Middle" & "Last"
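A minimal sketch of the full-name split. The function name is hypothetical, and the logic assumes whitespace-separated names with the given name first and the family name last, which will not hold for every naming convention.

```python
def split_full_name(full_name):
    """Split a full name into first, middle, and last parts."""
    parts = full_name.split()
    if not parts:
        return {"first": "", "middle": "", "last": ""}
    return {
        "first": parts[0],
        # Everything between the first and last token counts as the middle name.
        "middle": " ".join(parts[1:-1]),
        "last": parts[-1] if len(parts) > 1 else "",
    }

two_part = split_full_name("Ada Lovelace")
four_part = split_full_name("John Ronald Reuel Tolkien")
```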
Scaling
To make up for features that do not share a common range
Important for algorithms based on the distance between data points (K-NN, K-Means)
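A minimal sketch of two common scaling methods, min-max normalization and standardization, using only the Python standard library. The function names and sample columns are hypothetical. Without scaling, a feature with a large range (income) would dominate a distance computed together with a small-range feature (age).

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant column: nothing to spread out
    return [(v - low) / (high - low) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

ages = [20, 30, 40]
incomes = [20_000, 50_000, 80_000]

scaled_ages = min_max_scale(ages)
scaled_incomes = min_max_scale(incomes)
z_ages = standardize(ages)
```

After min-max scaling, both columns span the same 0 to 1 range, so each contributes comparably to a distance.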