Feature Engineering
basic knowledge
what is feature engineering
choosing the right features for our model
creating the right features for our model
why feature engineering
improve model performance
difference from data cleansing
feature engineering focuses on choosing/creating the right features
data cleansing focuses on making the data more readable and consistent, etc.
main process
brainstorm possible features
possible new features
possible feature candidates
process the features
implementation phase
immutability is advised
write the processed features to a new dataframe instead of modifying the original
validate on the model
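The immutability advice above can be sketched in plain Python, assuming rows are stored as a list of dicts rather than a real dataframe. The function name `add_features` and the columns `price`, `quantity`, and `total` are hypothetical and chosen only for illustration.

```python
import copy

def add_features(rows):
    """Return a new list of rows with an extra feature; the input is not modified."""
    new_rows = copy.deepcopy(rows)  # work on a copy, never on the original data
    for row in new_rows:
        # hypothetical candidate feature: total value of the order
        row["total"] = row["price"] * row["quantity"]
    return new_rows

raw = [
    {"price": 10.0, "quantity": 2},
    {"price": 9.0, "quantity": 3},
]
engineered = add_features(raw)  # raw stays unchanged and can be reused for other experiments
```

Keeping the original untouched makes it possible to compare the model with and without the new feature during validation.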
categorical encoding
count encoding
replace each category with its frequency (number of occurrences) in the data
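A minimal sketch of count encoding, using only the standard library. The function name `count_encode` and the sample column are assumptions made for this example.

```python
from collections import Counter

def count_encode(values):
    """Replace each category with the number of times it occurs in the column."""
    counts = Counter(values)
    return [counts[v] for v in values]

colors = ["red", "blue", "red", "green", "red", "blue"]
encoded = count_encode(colors)
```

Here "red" appears three times, so every "red" becomes 3, "blue" becomes 2, and "green" becomes 1.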
target encoding
replace each category with the mean of the target (average probability) for that category
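A minimal sketch of target encoding for a binary target. The names `target_encode`, `cities`, and `clicked` are hypothetical. In practice the means should be learned from training data only, otherwise the target leaks into the feature.

```python
from collections import defaultdict

def target_encode(categories, targets):
    """Replace each category with the mean target value observed for that category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    means = {category: sums[category] / counts[category] for category in sums}
    return [means[category] for category in categories]

cities = ["A", "B", "A", "B", "A"]
clicked = [1, 0, 0, 0, 1]
encoded = target_encode(cities, clicked)
```

City "A" has two positives out of three rows, so it is encoded as 2/3, while city "B" is encoded as 0.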
catboost encoding
target mean computed in row order, so each row only uses the rows that came before it
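The ordered idea behind CatBoost encoding can be sketched as follows. This is a simplified illustration, not CatBoost's actual implementation: the smoothing with a `prior` and a `weight` of 1 is an assumption, and the function name is hypothetical.

```python
def ordered_target_encode(categories, targets, prior, weight=1.0):
    """Encode each row using only the targets of earlier rows with the same category."""
    sums = {}
    counts = {}
    encoded = []
    for category, target in zip(categories, targets):
        seen_sum = sums.get(category, 0.0)
        seen_count = counts.get(category, 0)
        # smoothed mean of the targets seen so far; falls back to the prior when nothing was seen
        encoded.append((seen_sum + prior * weight) / (seen_count + weight))
        # only now add the current row, so it never influences its own encoding
        sums[category] = seen_sum + target
        counts[category] = seen_count + 1
    return encoded

cities = ["A", "B", "A", "B", "A"]
clicked = [1, 0, 0, 0, 1]
prior = sum(clicked) / len(clicked)  # global target mean used as the prior
encoded = ordered_target_encode(cities, clicked, prior)
```

Because a row never sees its own target, this variant reduces the leakage that plain target encoding can introduce.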
feature selection
the activity of choosing which features to use for the model
motivation
overfitting
too many features
underfitting
too few features
meaningless features
tips for feature selection
Univariate Feature Selection
score each feature individually and select the ones most strongly correlated with the target
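A minimal sketch of univariate selection, assuming Pearson correlation as the scoring function (other scores are possible). The helper names `pearson` and `select_top_k` and the sample data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (spread_x * spread_y)

def select_top_k(features, target, k):
    """Score every feature on its own and keep the k with the largest absolute correlation."""
    scores = {name: abs(pearson(column, target)) for name, column in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

features = {
    "size": [1, 2, 3, 4, 5],
    "noise": [3, 1, 4, 1, 5],
    "age": [5, 4, 3, 2, 2],
}
target = [2, 4, 6, 8, 10]
selected = select_top_k(features, target, k=2)
```

The absolute value is used so that strongly negative correlations (such as "age" here) count as informative too.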
handling null data
drop
drop when most of the data is null
drop when the value possibly does not exist (not applicable)
imputing is recommended instead when a default such as 0 / false makes sense
impute
impute when the value exists but was possibly not recorded
imputing considerations
use the median for numerical variables
use the mode for categorical variables
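The drop and impute rules above can be sketched as one function, assuming columns are stored as a dict of lists with `None` marking null values. The name `handle_nulls` and the 50% `drop_threshold` are assumptions, not values from the source.

```python
from statistics import median, mode

def handle_nulls(columns, drop_threshold=0.5):
    """Drop mostly-null columns; impute the rest with the median or the mode."""
    cleaned = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        null_ratio = 1 - len(present) / len(values)
        if null_ratio > drop_threshold:
            continue  # most of the data is null: drop the column
        if all(isinstance(v, (int, float)) for v in present):
            fill = median(present)  # numerical variable
        else:
            fill = mode(present)  # categorical variable
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned

data = {
    "age": [20, None, 40, 30],
    "city": ["Paris", "Rome", None, "Paris"],
    "notes": [None, None, None, "x"],
}
cleaned = handle_nulls(data)
```

The function returns a new dict, so the original `data` stays available for comparison.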
scaling & normalization
scaling
changes the range of the data (e.g. to 0-1)
motivation
features measured in different units or ranges cannot be compared directly
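A minimal sketch of scaling, assuming min-max scaling to the range 0 to 1 (one common choice among several). The function name and sample values are hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest becomes 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]  # constant column: avoid division by zero
    return [(v - lowest) / (highest - lowest) for v in values]

heights_cm = [150, 160, 170, 200]
scaled = min_max_scale(heights_cm)
```

Only the range changes; the relative spacing between the values, and therefore the shape of the distribution, stays the same.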
normalization
changes the shape of the distribution to make it closer to normal
motivation
some machine learning models only work well on normally distributed data
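A minimal sketch of normalization, assuming a log transform as the technique. This is only one option: it requires strictly positive values and helps with right-skewed data, but it does not guarantee a normal result. The helper names and sample data are hypothetical.

```python
import math

def skewness(values):
    """Sample skewness: 0 for a symmetric distribution, positive for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_transform(values):
    """Apply the natural logarithm; all values must be strictly positive."""
    return [math.log(v) for v in values]

incomes = [1, 2, 4, 8, 16, 32, 64]  # right-skewed toy data
transformed = log_transform(incomes)
skew_before = skewness(incomes)
skew_after = skewness(transformed)
```

Unlike scaling, this changes the shape of the distribution: the long right tail is compressed, so the skewness moves toward 0.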
feature generation
what is feature generation
the activity of creating new features from existing features
based on either the rows or the columns of other features
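A minimal sketch of both kinds of generated features. It rests on one interpretation of the outline: a "row" feature combines values within a single row, and a "column" feature uses a statistic computed over a whole column. All names here are hypothetical.

```python
def add_generated_features(rows):
    """Return new rows with one row-based and one column-based generated feature."""
    amounts = [row["amount"] for row in rows]
    mean_amount = sum(amounts) / len(amounts)  # statistic over the whole column
    new_rows = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay unchanged
        # row-based: combine two values from the same row into an interaction feature
        new_row["category_country"] = row["category"] + "_" + row["country"]
        # column-based: compare this row's value with the column mean
        new_row["amount_vs_mean"] = row["amount"] - mean_amount
        new_rows.append(new_row)
    return new_rows

rows = [
    {"category": "music", "country": "US", "amount": 10.0},
    {"category": "games", "country": "GB", "amount": 30.0},
]
generated = add_generated_features(rows)
```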