Data Transformations

Kinds of transformations you can apply to data

Text

Extract new columns from columns containing text
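As a minimal sketch of text extraction, assume each row is a plain Python dictionary with a hypothetical "name" column formatted as "Last, Title. First". A regular expression can then split that text into new columns. The column names, the name format, and the sample values are all assumptions made for illustration.

```python
import re

# Hypothetical sample rows; the "name" column and its format are assumptions.
rows = [
    {"name": "Braund, Mr. Owen"},
    {"name": "Heikkinen, Miss. Laina"},
]

# Pattern: last name, comma, title ending in a period, then the first name.
NAME_PATTERN = re.compile(
    r"^(?P<last>[^,]+),\s*(?P<title>[^.]+)\.\s*(?P<first>.+)$"
)

def extract_name_parts(row):
    """Return a copy of the row with last_name, title, and first_name added."""
    match = NAME_PATTERN.match(row["name"])
    new_row = dict(row)
    for column, group in (
        ("last_name", "last"),
        ("title", "title"),
        ("first_name", "first"),
    ):
        # Leave the new column empty (None) when the text does not match.
        new_row[column] = match.group(group).strip() if match else None
    return new_row

extracted = [extract_name_parts(row) for row in rows]
```

Rows whose text does not fit the assumed format keep their original value and get None in the new columns, so they can be reviewed separately.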

Categorical

Combine numerous categories into fewer categories

Turn multi-categorical columns into several binary columns
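Combining many categories into fewer can be sketched with a lookup table. The vehicle categories and group names below are hypothetical, and sending unmapped values to an "other" group is one possible design choice rather than a rule from these notes.

```python
# Hypothetical mapping from detailed categories to broader groups.
GROUPS = {
    "sedan": "car",
    "hatchback": "car",
    "pickup": "truck",
    "semi": "truck",
}

def combine_categories(values, mapping, default="other"):
    """Map each category to its group; unmapped categories go to `default`."""
    return [mapping.get(value, default) for value in values]

vehicle_type = ["sedan", "pickup", "hatchback", "scooter"]
vehicle_group = combine_categories(vehicle_type, GROUPS)
```

Here "scooter" has no entry in the mapping, so it lands in the "other" group instead of raising an error.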

Numerical

Add, subtract, or multiply two or more columns to create new columns
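A small sketch of column arithmetic, assuming rows stored as dictionaries with hypothetical "price", "quantity", and "discount" columns. Two new columns are derived from the existing ones.

```python
# Hypothetical sample rows; the column names are assumptions.
rows = [
    {"price": 10.0, "quantity": 3, "discount": 2.0},
    {"price": 4.5, "quantity": 2, "discount": 0.0},
]

for row in rows:
    # Multiply two columns to create a new column.
    row["gross"] = row["price"] * row["quantity"]
    # Subtract a third column from the result to create another.
    row["net"] = row["gross"] - row["discount"]
```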

IF-THEN statements

Examine a value in a column and, based on it, change that value or other values elsewhere in the dataset

Create content in a new column depending on what exists in one or more other columns
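An IF-THEN rule can be sketched as a function that inspects one or more columns and returns the value for a new column. The "income" and "has_default" columns, the 30000 threshold, and the labels are hypothetical.

```python
def risk_label(row):
    """IF-THEN rule that reads two columns and returns a label."""
    if row["has_default"]:
        return "high"
    elif row["income"] < 30000:
        return "medium"
    else:
        return "low"

# Hypothetical sample rows.
rows = [
    {"income": 80000, "has_default": True},
    {"income": 20000, "has_default": False},
    {"income": 55000, "has_default": False},
]

for row in rows:
    # The new column depends on what exists in the other two columns.
    row["risk"] = risk_label(row)
```

The order of the conditions matters: the first rule that matches decides the value, so a defaulted row is labeled "high" regardless of income.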

One-hot encoding

The conversion of a categorical column containing two or more possible values into discrete columns, one for each value

For each unique category in this column, a new column is created with binary values of 1 or 0 (true or false)
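One-hot encoding as described above can be sketched in a few lines. The "color" column and the prefix-based naming of the new columns are assumptions for illustration.

```python
def one_hot(values, prefix):
    """Create one 0/1 column per unique category found in `values`."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{category}": [1 if value == category else 0 for value in values]
        for category in categories
    }

# Hypothetical categorical column.
color = ["red", "green", "red", "blue"]
encoded = one_hot(color, "color")
```

Each row has exactly one 1 across the new columns, because every original value belongs to exactly one category.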

Transformations available to create new features:

Operators applied to one column

Natural logarithm [Log()]

Square root [Sqrt()]

Square [Square()]

Operators applied to two columns

Addition (+)

Subtraction (-)

Absolute [Abs()]

Multiplication (*)

Division (/)

Less than (<)

Less than or equal (<=)

Greater than (>)

Greater than or equal (>=)

Not equal (!=)

Equal (==)

Exponentiation (**)
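The operators listed above can be sketched as lookup tables of functions applied element by element to columns held as Python lists. The helper names apply_one and apply_two are hypothetical. Treating Abs() as a function of a single column is an assumption, since the list does not say how it combines two columns.

```python
import math
import operator

# Functions that transform a single column.
ONE_COLUMN = {
    "log": math.log,            # natural logarithm
    "sqrt": math.sqrt,          # square root
    "square": lambda x: x * x,  # square
    "abs": abs,                 # assumption: applied to one column
}

# Operators that combine two columns.
TWO_COLUMN = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "!=": operator.ne,
    "==": operator.eq,
    "**": operator.pow,
}

def apply_one(name, column):
    """Apply a one-column function to every value in the column."""
    return [ONE_COLUMN[name](value) for value in column]

def apply_two(symbol, left, right):
    """Apply a two-column operator to each pair of values, row by row."""
    return [TWO_COLUMN[symbol](a, b) for a, b in zip(left, right)]

# Hypothetical numeric columns.
a = [1.0, 4.0, 9.0]
b = [2.0, 4.0, 3.0]

root_a = apply_one("sqrt", a)
a_less_than_b = apply_two("<", a, b)
a_power_b = apply_two("**", a, b)
```

In this sketch, the logarithm of a non-positive value and the square root of a negative value raise ValueError, and division by zero raises ZeroDivisionError, so such rows need to be handled before the transformation is applied.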