Data Transformations
Transforms
Addition
+
Number of spouses + number of children + 1 = family size
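A minimal sketch of the family-size example, assuming the columns are held as plain Python dicts. The row values and the key names `spouses` and `children` are illustrative, not from the source.

```python
# Hypothetical rows; key names and values are made up for illustration.
rows = [
    {"spouses": 1, "children": 2},
    {"spouses": 0, "children": 0},
    {"spouses": 1, "children": 0},
]

# New feature: spouses + children + 1 (the +1 presumably counts the person).
family_size = [r["spouses"] + r["children"] + 1 for r in rows]
print(family_size)  # [4, 1, 2]
```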
Subtraction
-
Similarity or difference between columns is more apparent
Ex: Current Temp - Inside Therm. Temp = Likelihood to go outside
Absolute
Abs()
Actual distance between numbers is important
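Subtraction and absolute value can be sketched together. The temperature readings below are invented, and the list names are hypothetical.

```python
# Hypothetical temperature columns (same units, one entry per row).
current_temp = [75, 60, 68]
inside_temp = [70, 70, 68]

# Signed difference: shows which side is warmer.
temp_diff = [c - i for c, i in zip(current_temp, inside_temp)]

# Absolute difference: keeps only the distance between the two numbers.
temp_gap = [abs(d) for d in temp_diff]

print(temp_diff)  # [5, -10, 0]
print(temp_gap)   # [5, 10, 0]
```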
Multiplication
*
Moderation
Interaction Effect
Comes from the size of another feature
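One way to build an interaction feature is to multiply two columns row by row. The feature names `hours` and `intensity` are assumptions chosen only to illustrate the idea.

```python
# Hypothetical features: the effect of hours depends on intensity.
hours = [1.0, 2.0, 3.0]
intensity = [0.5, 1.0, 2.0]

# Interaction term: element-wise product of the two columns.
interaction = [h * i for h, i in zip(hours, intensity)]
print(interaction)  # [0.5, 2.0, 6.0]
```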
Division
/
Find hidden info
Income / # kids = Available Funds
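A sketch of the income-per-kid ratio. The data is invented, and returning `None` for rows with zero kids is an assumption made here to avoid dividing by zero; the source does not say how such rows should be handled.

```python
# Hypothetical columns.
incomes = [60000, 45000, 80000]
kids = [2, 0, 4]

# Ratio feature; rows with zero kids get None instead of raising an error.
funds_per_kid = [inc / k if k else None for inc, k in zip(incomes, kids)]
print(funds_per_kid)  # [30000.0, None, 20000.0]
```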
Less Than
<
Predictive
Less than or Equal
<=
Greater Than
>
Greater than or equal
>=
Not equal
!=
Equal
==
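The comparison operators each turn a numeric column into a True/False column. A small sketch, assuming an invented `ages` column and a threshold of 18 picked for illustration:

```python
# Hypothetical numeric column and threshold.
ages = [15, 18, 42]
threshold = 18

below = [a < threshold for a in ages]      # less than
at_most = [a <= threshold for a in ages]   # less than or equal
above = [a > threshold for a in ages]      # greater than
at_least = [a >= threshold for a in ages]  # greater than or equal
differs = [a != threshold for a in ages]   # not equal
matches = [a == threshold for a in ages]   # equal

print(below)    # [True, False, False]
print(matches)  # [False, True, False]
```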
Exponentiation
**
P = C e^(rt), in code: C * e**(r * t)
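The growth formula can be written as a short function. The function name `grow` and the sample arguments are hypothetical.

```python
import math

def grow(c, r, t):
    """Return c * e**(r * t): starting amount c, rate r, time t."""
    return c * math.e ** (r * t)

# With t = 0 nothing has grown yet, so the result is the starting amount.
start = grow(100, 0.05, 0)
later = grow(100, 0.05, 10)
print(start, later)
```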
Abilities of New Features
Text
New columns from columns that contain text
Categorical
Combine numerous categories into fewer
Multi-category columns into binary
Numerical
Add
Subtract
Multiply
Create new columns by:
Split or extract content
Put data into 1+ columns
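A rough sketch of the text and categorical cases: splitting one text column into two, grouping many categories into fewer, then reducing to a binary column. All names, values, and the state-to-region mapping are invented for illustration.

```python
# Text: split one column into two new columns.
full_names = ["Ada Lovelace", "Alan Turing"]
parts = [name.split(" ", 1) for name in full_names]
first_names = [p[0] for p in parts]
last_names = [p[1] for p in parts]

# Categorical: combine numerous categories into fewer.
region_of = {"NY": "East", "MA": "East", "CA": "West", "WA": "West"}
states = ["NY", "CA", "MA"]
regions = [region_of[s] for s in states]

# Categorical: multi-category column into a binary column.
is_east = [r == "East" for r in regions]

print(first_names, last_names)  # ['Ada', 'Alan'] ['Lovelace', 'Turing']
print(regions)                  # ['East', 'West', 'East']
print(is_east)                  # [True, False, True]
```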
Transformations applied one column at a time
Natural Logarithm
Log()
Linearize exponential data
The higher a family's total income, the less likely they'll visit national parks
At some point, love would trump income
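To see how the natural log linearizes exponential data, here is a sketch on synthetic values. The constants 2.0 and 0.5 are arbitrary choices, not from the source.

```python
import math

# Synthetic exponential data: y = 2.0 * e**(0.5 * t).
t = [0, 1, 2, 3]
y = [2.0 * math.exp(0.5 * x) for x in t]

# After taking the log, y becomes ln(2.0) + 0.5 * t, a straight line in t.
log_y = [math.log(v) for v in y]

# Equal steps between consecutive points confirm the line: each is close to 0.5.
steps = [b - a for a, b in zip(log_y, log_y[1:])]
print(steps)
```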
Square Root
Sqrt()
Works for different distributions of data
Square
Square()
Makes large values even larger
Square the standard deviation to get the variance
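A short sketch of square root, square, and the standard deviation to variance relationship, using the standard library. The sample values are invented, and the population versions (`pstdev`, `pvariance`) are an assumption; the source does not say which variant is meant.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]

roots = [math.sqrt(v) for v in values]  # compresses large values
squares = [v ** 2 for v in values]      # stretches large values

# Squaring the (population) standard deviation gives the variance.
std_dev = statistics.pstdev(values)
variance = std_dev ** 2
print(std_dev, variance)
```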
Splitting and Extracting New Columns
IF-THEN Statements
Create content in new column
Splitting a column
One-hot encoding
Conversion of a categorical column
Containing 2+ values in discrete column
Dummy Encoding
Predictive ability improves
Create True/False Statements
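The ideas in this branch can be sketched in plain Python. The `colors` and `temps` columns are invented, and dropping the first category for dummy encoding is one common convention, assumed here.

```python
# Hypothetical categorical column.
colors = ["red", "green", "red", "blue"]
categories = sorted(set(colors))  # ['blue', 'green', 'red']

# One-hot encoding: one 0/1 column per category.
one_hot = {c: [int(v == c) for v in colors] for c in categories}

# Dummy encoding: same idea, but one category is dropped as the baseline.
dummy = {c: one_hot[c] for c in categories[1:]}

# IF-THEN statement creating content in a new column.
temps = [55, 80]
label = ["warm" if t >= 70 else "cool" for t in temps]

print(one_hot)  # {'blue': [0, 0, 0, 1], 'green': [0, 1, 0, 0], 'red': [1, 0, 1, 0]}
print(label)    # ['cool', 'warm']
```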
Transformations create new Features