Ch13-15
Ch13: DataRobot
Use Chrome
Load data from a local file (easiest), HDFS, ODBC, or URL
.csv format
store text data as a tab-separated file (.tsv)
link to data on a website using its URL
use ODBC for SQL databases
use HDFS for Hadoop
compress the data if the internet connection is slow
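The two file-preparation notes above (tab-separated text and compression) can be sketched with the Python standard library. This is only an illustration: the function names and sample data are hypothetical, and gzip is an assumed compression format, so check which formats your DataRobot installation accepts before relying on it.

```python
import csv
import gzip
import io


def csv_to_tsv(csv_text: str) -> str:
    """Convert comma-separated text to tab-separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    # A tab delimiter means commas inside text fields no longer need quoting.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


def gzip_text(text: str) -> bytes:
    """Compress text with gzip (assumed format) before uploading."""
    return gzip.compress(text.encode("utf-8"))


# Hypothetical sample: the second column holds free text with a comma.
sample = 'id,comment\n1,"hello, world"\n2,plain\n'
tsv = csv_to_tsv(sample)
compressed = gzip_text(tsv)
```

On a file this small the gzip header outweighs the savings; compression pays off on larger datasets.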
Ch14: Feature Understanding and Selection
Unique column shows all unique values
When using histograms, adjust the number of bins to show the cutoffs between them
In bin ranges, [ ] means the endpoint is inclusive and ( ) means it is exclusive
Data Types: Categorical, Numerical, Boolean, Text
DataRobot shows only the first 50 features by default
auto-codes currency, measurements, and dates
Few Values = the feature has few unique values; Duplicate = another feature has the same values
code missing values as one consistent type
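Two of the notes above lend themselves to a small sketch: the bracket notation for histogram bins and coding missing values consistently. The code below is a minimal illustration, not DataRobot's own logic. The placeholder list, the function names, and the choice to close the last bin on both ends are all assumptions.

```python
import math

# Assumed set of placeholder strings; adjust for your own data.
MISSING_TOKENS = {"", "na", "n/a", "null", "none", "nan", "?", "-"}


def normalize_missing(value):
    """Map any recognized missing-value placeholder to None."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, str) and value.strip().lower() in MISSING_TOKENS:
        return None
    return value


def bin_label(x, edges):
    """Return the label of the bin containing x.

    Bins are half-open, [lo, hi): lo is included, hi is excluded.
    The last bin is closed, [lo, hi], so the maximum value has a home.
    """
    for i in range(len(edges) - 1):
        lo, hi = edges[i], edges[i + 1]
        last = i == len(edges) - 2
        if lo <= x < hi or (last and x == hi):
            return f"[{lo}, {hi}]" if last else f"[{lo}, {hi})"
    return None  # x falls outside every bin


edges = [0, 10, 20, 30]
labels = [bin_label(x, edges) for x in (0, 10, 19.9, 30)]
cleaned = [normalize_missing(v) for v in ["N/A", "", "42", None, float("nan")]]
```

Note that the value 10 lands in `[10, 20)`, not `[0, 10)`, because the upper edge of a half-open bin is exclusive.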
Ch15: Build Candidate Models
AutoPilot and QuickRun work fast because they run on only a sample of the data
Step 1: set target feature
Step 2: create CV and holdout partition
Step 3: characterize target variable
Step 4: load dataset and prepare data
Step 5: save target and partition info
Step 6: calculate importance scores
Step 7: calculate list of models
1 more item...
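Step 2 (creating the CV and holdout partitions) can be illustrated with a short standard-library sketch. It shows the general idea only and is not DataRobot's implementation. The function name, the 20% holdout fraction, the five folds, and the fixed seed are assumptions chosen for the example.

```python
import random


def make_partitions(n_rows, holdout_frac=0.2, n_folds=5, seed=0):
    """Split row indices into a holdout set and cross-validation folds."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    indices = list(range(n_rows))
    rng.shuffle(indices)

    # Set the holdout rows aside first; they are never used in training.
    n_holdout = int(n_rows * holdout_frac)
    holdout = indices[:n_holdout]
    rest = indices[n_holdout:]

    # Deal the remaining rows round-robin into n_folds CV folds.
    folds = [rest[i::n_folds] for i in range(n_folds)]
    return holdout, folds


holdout, folds = make_partitions(100)
```

Each fold takes a turn as the validation set while the other folds train the model; the holdout is scored only at the end as a final check.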