Chapter 10: Splitting and Extracting New Columns
IF-THEN statements and One-hot encoding
From Text: "They allow for the examination of a value in a column and the ability to make changes to this value to other values elsewhere in the dataset." pg. 96
From Text: "IF-THEN statements allow you to create content in a new column depending on what exists in one or more other columns." pg. 96
IF, THEN, ELSE, ENDIF
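The IF / THEN / ELSE / ENDIF pattern above can be sketched in Python. This is only an illustration: the `amount` column, the `order_size` column, and the threshold of 100 are hypothetical and do not come from the text.

```python
# Hypothetical rows; the column names and values are assumptions for illustration.
rows = [
    {"order_id": 1, "amount": 250.0},
    {"order_id": 2, "amount": 40.0},
    {"order_id": 3, "amount": 100.0},
]

def order_size(row):
    # IF amount >= 100 THEN "large" ELSE "small" ENDIF
    if row["amount"] >= 100:
        return "large"
    else:
        return "small"

# Create the new column from the content of an existing column.
for row in rows:
    row["order_size"] = order_size(row)

print([row["order_size"] for row in rows])
```

The existing `amount` column is left untouched; the rule only writes to the new `order_size` column.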
One-hot encoding: (From Text:) "One-hot encoding is the conversion of a categorical column containing two or more possible values into discrete columns representing each value" pg. 97
From Text: "For each unique category in the column, a new column is created with binary values of 1 or 0 (TRUE or FALSE). In statistics this is called "dummy" encoding. The approach is used because some machine-learning tools are not able to properly analyze the content of multi-categorical features." pg. 97
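A minimal sketch of one-hot encoding as described in the quotes above, using plain Python. The helper `one_hot_encode` and the `color` column are hypothetical names chosen for the example, not something the text defines.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 1/0 column per unique category."""
    # Collect the unique categories; sorting keeps the new column order stable.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every column except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded

# Hypothetical data for illustration.
rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "green"},
    {"id": 3, "color": "red"},
]
encoded = one_hot_encode(rows, "color")
for row in encoded:
    print(row)
```

Each output row has exactly one of the new columns set to 1, which is the binary representation the text refers to as "dummy" encoding.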
10.2: Transformations
About the various transformations available to create new features
Addition (+): (From Text:) "By adding different columns, predictive signals can be increased" pg. 107
Subtraction (-): (From Text:) "By subtracting one column from another, the similarity or difference between them becomes more apparent; the closer the number is to zero, the more similar" pg. 107
Absolute (Abs()): (From Text:) "Similar to subtraction but used in cases where the actual distance between two numbers rather than whether it is negative or positive is of importance" pg. 107
Multiplication (*): (From Text:) "Sometimes two related columns interact with a target in a way that is only detectable through their product. This interaction effect is often called a moderated relationship between a column and the target. The moderation comes from the size of another feature" pg. 108
Division (/): (From Text:) "By dividing one column by another, sometimes information that is otherwise hidden from some types of algorithms can be made available." pg. 108
Less than (<): (From text:) "If the number of seats available in a family’s largest car is smaller than their family size after the birth of a new child, this feature may be predictive of the purchase of new car." pg. 108
Less than or equal (<=)
Greater than (>): see pg. 109 for more
Greater than or equal (>=): see pg. 108 for more
Equal (==): see pg. 108 for more
Not equal (!=): see pg. 108 for more
Exponentiation (**): see pg. 108 for more
Natural logarithm (Log()): see pg. 109 for more
Square root (Sqrt()): see pg. 109 for more
Square (Square()): see pg. 109 for more
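The transformations listed above can be sketched as derived features computed from two numeric columns. This is a rough illustration in plain Python: the column names `x` and `y`, the helper `add_features`, and the choice of a cube to show exponentiation are all assumptions, and the text's `Log()`, `Sqrt()` and `Square()` are mapped here onto Python's `math.log`, `math.sqrt` and `** 2`.

```python
import math

def add_features(row):
    """Return a copy of row with new columns derived from x and y."""
    x, y = row["x"], row["y"]
    out = dict(row)
    # Arithmetic transformations
    out["x_plus_y"] = x + y
    out["x_minus_y"] = x - y            # closer to zero means more similar
    out["abs_diff"] = abs(x - y)        # distance, ignoring the sign
    out["x_times_y"] = x * y            # interaction between the two columns
    out["x_div_y"] = x / y if y != 0 else None   # guard against division by zero
    # Comparisons, stored as 1/0 flags
    out["x_lt_y"] = int(x < y)
    out["x_le_y"] = int(x <= y)
    out["x_gt_y"] = int(x > y)
    out["x_ge_y"] = int(x >= y)
    out["x_eq_y"] = int(x == y)
    out["x_ne_y"] = int(x != y)
    # Power and root transformations of a single column
    out["x_cubed"] = x ** 3                            # exponentiation
    out["log_x"] = math.log(x) if x > 0 else None      # natural logarithm
    out["sqrt_x"] = math.sqrt(x) if x >= 0 else None   # square root
    out["x_squared"] = x ** 2                          # square
    return out

# Hypothetical row for illustration.
features = add_features({"x": 4.0, "y": 9.0})
for name, value in features.items():
    print(name, value)
```

The `None` guards are a design choice for this sketch: division by zero and the logarithm of a non-positive number are undefined, so the feature is left missing rather than raising an error.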