Larsen Chapters 13 - 15: Data Robot
Chapter 13: Startup Process
Uploading Data
For this class, analyzing 10 small datasets will be more useful than analyzing 1 big one
DataRobot accepts .csv, .tsv, .xls, and .xlsx files
Methods
Easiest way - read a local file
Compressing the data (.zip) will speed up the upload
Alternatively - use a direct URL for data that is on the web
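As a rough sketch of the compression tip above, a local CSV can be zipped with Python's standard library before it is uploaded. The file name sample.csv, its contents, and the helper zip_csv are made up for illustration; nothing here calls DataRobot itself.

```python
import csv
import os
import tempfile
import zipfile


def zip_csv(csv_path: str) -> str:
    """Write csv_path into a sibling .zip archive and return the archive path."""
    zip_path = os.path.splitext(csv_path)[0] + ".zip"
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname stores only the file name, not the full directory path
        archive.write(csv_path, arcname=os.path.basename(csv_path))
    return zip_path


# Build a small, repetitive sample file (hypothetical data) to compress.
workdir = tempfile.mkdtemp()
sample_csv = os.path.join(workdir, "sample.csv")
with open(sample_csv, "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["id", "label"])
    for i in range(1000):
        writer.writerow([i, "yes" if i % 2 else "no"])

sample_zip = zip_csv(sample_csv)
# Print the size in bytes before and after compression.
print(os.path.getsize(sample_csv), os.path.getsize(sample_zip))
```

The resulting .zip file is what would then be selected as the local file to upload.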
Chapter 14: Feature Understanding and Selection
Descriptive Statistics
[ ] - Inclusive Range
( ) - Exclusive Range
DataRobot lets us analyze each feature with a range of descriptive statistics
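The bracket notation above can be checked with a few lines of code. This is only an illustrative sketch; the function name in_range and its notation argument are hypothetical and not part of DataRobot.

```python
def in_range(value, low, high, notation="[]"):
    """Return True if value lies in the interval written with the given brackets.

    notation is a two-character string: '[' or '(' followed by ']' or ')'.
    A square bracket includes the endpoint, a round bracket excludes it.
    """
    if notation not in ("[]", "[)", "(]", "()"):
        raise ValueError(f"unknown interval notation: {notation!r}")
    left_ok = value >= low if notation[0] == "[" else value > low
    right_ok = value <= high if notation[1] == "]" else value < high
    return left_ok and right_ok


print(in_range(10, 10, 20, "[]"))  # True: 10 is an included endpoint
print(in_range(10, 10, 20, "(]"))  # False: 10 is an excluded endpoint
```

So [10, 20] contains both 10 and 20, while (10, 20) contains neither.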
Data Types
Data types indicate the nature of the data inside of a DataRobot feature
Can be binary, categorical, multi-class categorical, etc.
Most common data type is numerical.
Chapter 15: Build Candidate Models
Starting the Process
DataRobot lets you choose which metric the produced models are optimized for
It will usually auto-select a good metric
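To see why the choice of metric matters, here is a small sketch that scores two sets of predictions with two common metrics, log loss and RMSE. The targets, the predictions, and the names model_a and model_b are invented for illustration and do not come from DataRobot or from the book.

```python
import math


def log_loss(actual, predicted):
    """Mean negative log-likelihood for binary targets (lower is better)."""
    total = 0.0
    for y, p in zip(actual, predicted):
        # Probability the model assigned to the class that actually occurred
        total += -math.log(p if y == 1 else 1.0 - p)
    return total / len(actual)


def rmse(actual, predicted):
    """Root mean squared error (lower is better)."""
    squared = [(y - p) ** 2 for y, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical binary targets and predicted probabilities of class 1.
actual = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1]
# model_a is cautious on every row.
model_a = [0.6 if y == 1 else 0.4 for y in actual]
# model_b is confident and right on nine rows, confident and wrong on the last.
model_b = [0.99 if y == 1 else 0.01 for y in actual[:-1]] + [0.0001]

for name, preds in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(log_loss(actual, preds), 3), round(rmse(actual, preds), 3))
```

With these made-up numbers, model_a has the lower log loss while model_b has the lower RMSE, so the two metrics would pick different winners.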
Advanced Options
Advanced options include Group, Date/Time, and Partition settings
Importance Column
Indicates the relative importance of a particular feature when compared against the target
This value is calculated by logistic or linear regression, using the ACE algorithm
Model Selection Process
DataRobot Autopilot can be thought of as a five-round tournament to determine the best approach and algorithm
Model blending is the last step of the process; at that point the resulting model is very hard for a human to interpret.
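The tournament idea and the final blending step can be sketched in a few lines. This is a toy illustration under stated assumptions, not DataRobot's actual procedure: the model names, the error scores, the number of survivors per round, and the helpers run_tournament and blend are all hypothetical, and the blend shown is a plain average.

```python
def run_tournament(scores_by_round, keep_counts):
    """Successive elimination: each round keeps the lowest-error models.

    scores_by_round: list of dicts mapping model name -> validation error
                     for that round (hypothetical numbers, lower is better).
    keep_counts:     how many models survive each round.
    """
    survivors = list(scores_by_round[0])
    for round_scores, keep in zip(scores_by_round, keep_counts):
        ranked = sorted(survivors, key=lambda name: round_scores[name])
        survivors = ranked[:keep]
    return survivors


def blend(predictions_by_model):
    """Average blend: mean of the models' predictions, row by row."""
    columns = list(predictions_by_model.values())
    return [sum(row) / len(row) for row in zip(*columns)]


# Two rounds for brevity; a five-round tournament would pass five dicts.
rounds = [
    {"glm": 0.42, "tree": 0.40, "gbm": 0.35, "knn": 0.50},
    {"glm": 0.39, "tree": 0.41, "gbm": 0.33},
]
finalists = run_tournament(rounds, keep_counts=[3, 2])

# Hypothetical predictions from the two finalists on three rows.
predictions = {"gbm": [0.9, 0.2, 0.6], "glm": [0.7, 0.4, 0.5]}
blended = blend({name: predictions[name] for name in finalists})
print(finalists, blended)
```

Each blended prediction mixes the outputs of several models, which is one reason a blend is harder to explain than any single model in it.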