Chapter 12: Data Reduction and Splitting
Unique Rows
Data contains Duplicate rows
Remove the duplicates
Partial match removal
Remove full rows; identical content in some columns
rows to keep first
rows to remove next
unique function
Summarize function
Complete match removal
Remove full rows; all columns identical
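The two kinds of duplicate removal above can be sketched in plain Python. This is only an illustration under stated assumptions: the sample rows, the column names (customer, city, amount), and the helper unique_rows are all hypothetical, and the notes do not say which tool's unique function they refer to.

```python
# Hypothetical sample rows; the column names are assumptions for illustration.
rows = [
    {"customer": "A", "city": "Oslo", "amount": 10},
    {"customer": "A", "city": "Oslo", "amount": 10},
    {"customer": "B", "city": "Rome", "amount": 20},
    {"customer": "B", "city": "Rome", "amount": 25},
    {"customer": "C", "city": "Lima", "amount": 30},
]


def unique_rows(rows, key_columns):
    """Keep the first row seen for each combination of key_columns."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Complete match removal: every column takes part in the comparison.
complete = unique_rows(rows, ["customer", "city", "amount"])

# Partial match removal: order the rows so the ones to keep come first,
# then remove later rows that match on the chosen columns only.
ordered = sorted(rows, key=lambda r: r["amount"], reverse=True)
partial = unique_rows(ordered, ["customer", "city"])
```

Sorting before deduplicating is what decides which of the partially matching rows survives; here the row with the largest amount is kept for each customer and city pair.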
Filtering
Split a set of data in two based on its characteristics
Union training and test data
Separate again once data modifications are complete
test and train data
data transformations apply to subset of data
two separate tables
Use union to combine training and test
filter element in new training set
Split further based on data in columns
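The union, modify, and separate workflow above might look like the following in plain Python. It is a minimal sketch, not the method of any particular tool: the tables, the source tag, the age_band transformation, and the region column are all assumptions made for illustration.

```python
# Hypothetical training and test tables; names and values are assumptions.
train = [
    {"id": 1, "age": 20, "region": "north"},
    {"id": 2, "age": 40, "region": "south"},
]
test = [
    {"id": 3, "age": 60, "region": "north"},
]

# Union: tag each row with its source so the tables can be separated later.
combined = (
    [dict(r, source="train") for r in train]
    + [dict(r, source="test") for r in test]
)

# Apply one data modification to the combined table (a derived age band).
for row in combined:
    row["age_band"] = "under_30" if row["age"] < 30 else "30_plus"

# Filter: separate the combined table back into training and test data.
new_train = [r for r in combined if r["source"] == "train"]
new_test = [r for r in combined if r["source"] == "test"]

# Split the new training set further based on the data in a column.
train_north = [r for r in new_train if r["region"] == "north"]
train_other = [r for r in new_train if r["region"] != "north"]
```

Tagging rows before the union is one simple way to guarantee that the later filter can recover exactly the original two tables after the modifications are applied.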