Chapters 13-15
Chapter 14: Feature Understanding and Selection
Descriptive Statistics
Feature Name
Index
Variable Type
Unique
Row ID
Visualizations
see page 137
Data Types
nature of data inside of a feature
noted in unique column
binary
multi-class categorical
numeric
make sure it is correctly categorized
Boolean
true or false
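A minimal sketch of how the data types above could be guessed from a column's unique values, so a miscategorized feature is easier to spot. The function name `infer_feature_type` and its rules are hypothetical illustrations, not DataRobot's actual logic.

```python
def _is_number(value):
    """Return True if the value can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def infer_feature_type(values):
    """Guess a coarse feature type from one column (hypothetical rules)."""
    # Ignore missing entries before counting unique values.
    present = [v for v in values if v is not None and v != ""]
    if not present:
        return "unknown"
    unique = set(present)
    # Boolean: every value reads as true or false.
    if {str(v).lower() for v in unique} <= {"true", "false"}:
        return "boolean"
    # Binary: exactly two distinct values of any kind.
    if len(unique) == 2:
        return "binary"
    # Numeric: every remaining value parses as a number.
    if all(_is_number(v) for v in present):
        return "numeric"
    return "multi-class categorical"
```

Comparing the guessed type with the type shown in the Variable Type column is one way to check that a feature is correctly categorized.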
Evaluations of Feature Content
unique identifier
the system won't use it to predict your target
Missing Values
missing column
missing values
converting them all to one consistent type
null values
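A minimal sketch of converting differently coded missing values to one consistent type before upload. The marker list `MISSING_MARKERS` is an assumption chosen for illustration; it is not the set of markers DataRobot itself recognizes.

```python
import math

# Assumed spellings of "missing"; adjust to match the dataset at hand.
MISSING_MARKERS = {"", "na", "n/a", "nan", "null", "none", "?"}


def is_missing(value):
    """Return True if the value should be treated as missing."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip().lower() in MISSING_MARKERS


def normalize_missing(values):
    """Replace every missing marker with None, leaving other values as-is."""
    return [None if is_missing(v) else v for v in values]


def count_missing(values):
    """Count missing entries, similar in spirit to a Missing column."""
    return sum(1 for v in values if is_missing(v))
```

For example, `normalize_missing(["3", "N/A", "", "null", "7"])` keeps `"3"` and `"7"` and turns the other three entries into `None`.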
Chapter 15: Build Candidate Models
Starting the Process
select the target feature
DataRobot uses downsampling
LogLoss (Accuracy)
evaluated on probability
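A minimal sketch of the standard binary LogLoss formula, to show what "evaluated on probability" means: the metric scores the predicted probability itself rather than a thresholded label. The function name `log_loss` and the clipping value `eps` are illustrative choices, not taken from DataRobot.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss for 0/1 labels and predicted probabilities of class 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

Lower is better: a confident correct prediction such as 0.9 for a true 1 costs little, while a confident wrong prediction such as 0.1 for a true 1 is penalized heavily.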
Advanced Options
cross-validation
Partition Feature
Group
Date/Time
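A minimal sketch of what a Group partition could look like for cross-validation, assuming the goal is to keep every row that shares a value of the partition feature in the same fold. The function `group_k_fold` and its balancing rule are hypothetical, not DataRobot's implementation.

```python
from collections import defaultdict


def group_k_fold(groups, k):
    """Assign row indices to k folds so rows sharing a group value share a fold."""
    # Collect the row indices that belong to each group value.
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)

    folds = [[] for _ in range(k)]
    # Place the largest groups first, each into the currently smallest fold.
    ordered = sorted(by_group, key=lambda g: (-len(by_group[g]), str(g)))
    for group in ordered:
        smallest = min(range(k), key=lambda i: len(folds[i]))
        folds[smallest].extend(by_group[group])
    return folds
```

Each fold then serves once as the validation set while the remaining folds are used for training, so no group appears on both sides of a split.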
Starting the Analytical Process
Prescribed Options
Autopilot
Quick
Manual
Model Selection Process
several algorithms running
Autopilot
process runs as a five-round tournament to determine the best approach and algorithm
Tournament round
Round 2: 32% Sample
Round 3: 64% Sample
Round 4: Cross-Validation
Round 5: Blending
Chapter 13: Startup Processes
Uploading Data
local file
DataRobot will accept separated-value files
URL
ODBC
HDFS
Create New Projects
tag it
Data in upper left corner