Machine Learning: Chapter 10: Data Transformations
Depending on Type of Data
Text: extract new columns from columns containing text. Several transformations are generally necessary to find predictive ability within text.
Categorical: combine categories; turn multi-categorical columns into binary columns
Numeric: apply mathematical operations (for example log, square root, square)
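As a rough illustration of "combine categories", the sketch below folds infrequent categories into a single catch-all value. The helper name `combine_rare`, the `min_count` threshold, the "Other" label, and the sample data are all assumptions made for this example, not something the chapter prescribes.

```python
from collections import Counter

def combine_rare(values, min_count=2, other="Other"):
    """Replace categories seen fewer than min_count times with a catch-all label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]

# Hypothetical categorical column
states = ["CA", "CA", "NY", "NY", "WY", "VT"]
combined = combine_rare(states)
print(combined)
```

Fewer, better-populated categories also keep the number of binary columns manageable if the column is one-hot encoded later.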
Splitting and Extracting New Columns
If-Then Statements and One-hot encoding
One-hot encoding: converting a categorical column into discrete binary columns, one for each value
Dummy variables: for each unique category, a new column is created with binary values
Improves predictive ability
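The two ideas above can be sketched in plain Python, assuming rows are stored as dictionaries. The column names (`price`, `color`), the threshold of 100, and the helper `one_hot_encode` are hypothetical and chosen only for illustration.

```python
def one_hot_encode(rows, column):
    """Replace `column` with one binary (0/1) column per unique category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every column except the one being encoded
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded

# Hypothetical sample data
rows = [
    {"id": 1, "price": 120, "color": "red"},
    {"id": 2, "price": 40, "color": "blue"},
    {"id": 3, "price": 75, "color": "red"},
]

# If-then statement: derive a binary flag from a numeric column
for row in rows:
    row["is_expensive"] = 1 if row["price"] > 100 else 0

encoded = one_hot_encode(rows, "color")
for row in encoded:
    print(row)
```

Each row ends up with exactly one of the `color_*` columns set to 1, which is the defining property of a one-hot (dummy variable) encoding.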
RegEx
replaces, extracts, finds content from text
find patterns: beginning of line, end of line
any text with a predictable format can be extracted
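A minimal sketch of these RegEx uses with Python's standard `re` module follows. The sample strings and the patterns `ORDER_RE` and `DATE_RE` are assumptions for illustration; real text columns would need patterns matched to their own format.

```python
import re

# Hypothetical free-text column
notes = [
    "Order 1042 shipped on 2021-03-15",
    "Order 77 shipped on 2020-11-02",
    "No order information",
]

ORDER_RE = re.compile(r"^Order (\d+)")             # ^ anchors at the beginning of the line
DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})$")  # $ anchors at the end of the line

def extract_columns(text):
    """Extract new columns from one text value; None where the pattern is absent."""
    order = ORDER_RE.search(text)
    date = DATE_RE.search(text)
    return {
        "order_id": int(order.group(1)) if order else None,
        "year": int(date.group(1)) if date else None,
        "month": int(date.group(2)) if date else None,
    }

extracted = [extract_columns(t) for t in notes]

# Replace: mask every digit in a value
masked = re.sub(r"\d", "#", notes[1])

print(extracted)
print(masked)
```

Rows that do not follow the expected format yield `None` rather than an error, which makes it easy to spot values the pattern failed to cover.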
Transformations
comparison of two columns can create a new column with predictive ability
log()
linearize exponential data
Sqrt()
similar to log() but a milder transformation; used for different distributions (for example, data that includes zeros, where log() is undefined)
Square()
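The transformations listed above can be sketched with Python's standard `math` module. All data here is made up for illustration: `y` is a hypothetical exponential series, and `revenue` and `cost` are hypothetical columns used to show a comparison of two columns.

```python
import math

# Hypothetical exponential data: y doubles with each step of x
x = [0, 1, 2, 3, 4]
y = [3 * 2 ** xi for xi in x]  # 3, 6, 12, 24, 48

# log(): after the transform, successive values differ by a constant,
# so the exponential series becomes a straight line in x
log_y = [math.log(v) for v in y]
steps = [b - a for a, b in zip(log_y, log_y[1:])]

# Sqrt() and Square() applied element-wise
sqrt_y = [math.sqrt(v) for v in y]
squared_x = [xi ** 2 for xi in x]

# Comparison of two columns creates a new column
revenue = [100.0, 250.0, 90.0]
cost = [80.0, 100.0, 90.0]
margin = [r - c for r, c in zip(revenue, cost)]

print(steps)
print(squared_x)
print(margin)
```

The constant step in `log_y` is what "linearize exponential data" means in practice: a linear model can fit the transformed values even though it could not fit the raw ones.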