Chapter 12
Sampling
select smaller datasets
mirror characteristics of a population
can preserve processing power
critical to retain rows
Possibly down sampling
Machine learning uses sampling to learn from smaller data sets
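A minimal sketch of sampling that mirrors a population's characteristics, written in plain Python rather than any particular tool. The function name `stratified_sample`, the `region` column, and the 80/20 population are all hypothetical, chosen only for illustration. Taking the same fraction from each group keeps the group proportions of the full data set in the smaller one.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, fraction, seed=0):
    """Sample the same fraction from each group so group proportions are kept."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    sample = []
    for members in groups.values():
        # Keep at least one row per group so no group disappears.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 80 rows in region "N", 20 rows in region "S".
population = [{"id": i, "region": "N" if i < 80 else "S"} for i in range(100)]

sample = stratified_sample(population, key="region", fraction=0.1)
counts = {r: sum(1 for row in sample if row["region"] == r) for r in ("N", "S")}
print(len(sample), counts)  # 10 {'N': 8, 'S': 2}
```

The 10-row sample keeps the 80/20 split of the population, which a simple random draw of 10 rows would only match on average.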
Holdback, validation and cross validation
holdout sample
cross validation tests the model on data held back from training
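The holdout and cross validation ideas above can be sketched as follows, assuming plain Python lists stand in for a table. The helper names `holdout_split` and `k_fold_indices`, the 20% holdout fraction, and the choice of 5 folds are illustrative assumptions, not values from the chapter.

```python
import random


def holdout_split(rows, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and set a fraction aside as the holdout sample."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def k_fold_indices(n_rows, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_rows))
    for fold in range(k):
        validation = indices[fold::k]  # every k-th row, starting at this fold
        held = set(validation)
        train = [i for i in indices if i not in held]
        yield train, validation


rows = list(range(10))
train, holdout = holdout_split(rows)
folds = list(k_fold_indices(len(rows), k=5))

print(len(train), len(holdout))  # 8 2
print(len(folds))                # 5
```

With a holdout sample, each row is used either for training or for checking, never both. With cross validation, every row takes a turn in the validation set exactly once across the folds.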
Filtering
Splitting into 2 tables
sorting characteristics
data transformations
filter ranges
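A minimal sketch of filtering on a range and splitting the result into two tables, followed by a sort. The `orders` data, the `amount` column, and the helper `split_by_range` are hypothetical; the sketch assumes the range is inclusive at both ends.

```python
def split_by_range(rows, column, low, high):
    """Split rows into those inside [low, high] and those outside it."""
    matched, unmatched = [], []
    for row in rows:
        if low <= row[column] <= high:
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched


# Hypothetical order table.
orders = [
    {"order": 1, "amount": 5},
    {"order": 2, "amount": 50},
    {"order": 3, "amount": 120},
    {"order": 4, "amount": 75},
]

in_range, out_of_range = split_by_range(orders, "amount", 50, 100)

# Sort the matching rows by amount, largest first.
in_range_sorted = sorted(in_range, key=lambda r: r["amount"], reverse=True)

print([r["order"] for r in in_range_sorted])  # [4, 2]
print([r["order"] for r in out_of_range])     # [1, 3]
```

Keeping the rows that fail the filter in a second table, rather than discarding them, means no rows are lost by the split.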
Data reduction and splitting
Unique Rows
Partial match removal
Complete match removal
Summarize function
Unique function
Product Ids from past data
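A sketch of unique-row reduction in plain Python. It assumes that "complete match removal" means dropping rows identical in every column and that "partial match removal" means dropping rows that match on a chosen subset of columns; the notes do not define either term, so treat both readings as assumptions. The helper `unique_rows` and the `past_orders` data are hypothetical, and the `Counter` at the end is only a stand-in for a summarize step that counts rows per group.

```python
from collections import Counter


def unique_rows(rows, columns=None):
    """Keep the first row seen for each distinct key.

    columns=None compares every column (complete match);
    otherwise only the named columns are compared (partial match).
    """
    seen = set()
    kept = []
    for row in rows:
        cols = columns if columns is not None else sorted(row)
        key = tuple(row[c] for c in cols)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical past order data.
past_orders = [
    {"product_id": "A1", "qty": 2},
    {"product_id": "A1", "qty": 2},  # identical to the row above
    {"product_id": "A1", "qty": 5},  # same product, different quantity
    {"product_id": "B7", "qty": 1},
]

complete = unique_rows(past_orders)                         # whole-row duplicates removed
partial = unique_rows(past_orders, columns=["product_id"])  # one row per product
product_ids = [r["product_id"] for r in partial]
rows_per_product = Counter(r["product_id"] for r in past_orders)

print(len(complete))           # 3
print(product_ids)             # ['A1', 'B7']
print(dict(rows_per_product))  # {'A1': 3, 'B7': 1}
```

Matching on `product_id` alone gives the distinct list of product IDs from past data, while the complete match keeps the row with a different quantity.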