Chapter 10: Data Transformations
transformations
2 columns can lead to a useful 3rd column
Addition
predictive signals can be increased
Subtraction
similarities or differences become more apparent
Absolute
used in cases where the size of the difference matters, not its direction
Multiplication
captures a moderated (interaction) relationship between a column and the target
Division
can reveal hidden information, such as ratios between columns
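The arithmetic transformations above can be sketched in a few lines of plain Python. This is only an illustration: the rows and column names (`income`, `debt`, `adults`, `children`) are made up for the example and do not come from the chapter.

```python
# Hypothetical rows; column names and values are illustrative only.
rows = [
    {"income": 60000, "debt": 15000, "adults": 2, "children": 2},
    {"income": 45000, "debt": 30000, "adults": 1, "children": 0},
]

def add_arithmetic_features(row):
    """Return a copy of row with new columns derived from pairs of columns."""
    out = dict(row)
    # Addition: combine two weak signals into one stronger one.
    out["family_size"] = row["adults"] + row["children"]
    # Subtraction: make the difference between two columns explicit.
    out["net_income"] = row["income"] - row["debt"]
    # Absolute: keep only the size of the difference, not its direction.
    out["abs_adult_child_gap"] = abs(row["adults"] - row["children"])
    # Multiplication: let one column moderate the effect of another.
    out["income_x_family"] = row["income"] * out["family_size"]
    # Division: expose a ratio; guard against a zero denominator.
    out["debt_to_income"] = row["debt"] / row["income"] if row["income"] else None
    return out

features = [add_arithmetic_features(r) for r in rows]
```

Each derived value becomes a third column that a model can use alongside the two original columns.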
Less than
used to help with family sizes
Less than or equal
also useful for family size
predictive of bed purchases in a new house
Greater than
Ex. can predict whether a family will take a road trip based on # of seats in car
Greater than or equal
Ex. can predict likelihood of purchasing a new car based on # of seats
Not equal
can have an effect on the prediction
Equal
can cancel each other out in some cases
or may indicate a higher likelihood of something happening
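The six comparison transformations can be sketched as 0/1 flag columns. The example below assumes two hypothetical columns, `family_size` and `car_seats`, chosen to echo the family and car examples in these notes; the values are invented.

```python
# Hypothetical households; column names and values are illustrative only.
households = [
    {"family_size": 4, "car_seats": 5},
    {"family_size": 6, "car_seats": 5},
    {"family_size": 5, "car_seats": 5},
]

def add_comparison_features(row):
    """Return a copy of row with one 0/1 flag per comparison operator."""
    out = dict(row)
    a, b = row["family_size"], row["car_seats"]
    out["family_lt_seats"] = int(a < b)   # less than
    out["family_le_seats"] = int(a <= b)  # less than or equal
    out["family_gt_seats"] = int(a > b)   # greater than
    out["family_ge_seats"] = int(a >= b)  # greater than or equal
    out["family_ne_seats"] = int(a != b)  # not equal
    out["family_eq_seats"] = int(a == b)  # equal
    return out

compared = [add_comparison_features(r) for r in households]
```

In this sketch the second household (six people, five seats) is the only one flagged by `family_gt_seats`, which is the kind of signal that might relate to buying a larger car.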
Exponentiation
compound interest is a good example of this
IF-THEN
best friend
creates content in new column
one-hot encoding
creates discrete new columns representing individual values
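IF-THEN rules and one-hot encoding can be combined, as in the minimal sketch below. The `age_band` rule, its thresholds, and the `is_` column prefix are assumptions made for the example, not taken from the chapter.

```python
def age_band(age):
    """IF-THEN rule: map a numeric age to a label (thresholds are arbitrary)."""
    if age < 18:
        return "minor"
    elif age < 65:
        return "adult"
    return "senior"

def one_hot(values):
    """Return one dict per value, with a 0/1 column for each distinct category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

ages = [12, 40, 70]                  # hypothetical input column
bands = [age_band(a) for a in ages]  # new column created by the IF-THEN rule
encoded = one_hot(bands)             # one new 0/1 column per band
```

The IF-THEN step creates the content of a new column, and one-hot encoding then spreads that column into one discrete column per value.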
single column transformations
Natural logarithm
linearize exponential data
Square root
compare its ability to linearize data against the natural logarithm
Square
makes large values larger
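The three single-column transformations can be compared on a small made-up series that grows by a constant factor. The values and the helper name `steps` are assumptions for this sketch.

```python
import math

# Hypothetical column that grows by a constant factor (exponential growth).
values = [1.0, 10.0, 100.0, 1000.0]

logged = [math.log(v) for v in values]    # natural logarithm
rooted = [math.sqrt(v) for v in values]   # square root
squared = [v ** 2 for v in values]        # square

def steps(xs):
    """Differences between consecutive values; equal steps mean a straight line."""
    return [b - a for a, b in zip(xs, xs[1:])]

log_steps = steps(logged)
root_steps = steps(rooted)
```

On this series the steps in `logged` are all equal, so the log has linearized the data. The steps in `rooted` still grow, so the square root compresses the data without linearizing it, and `squared` stretches the large values even further.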