Ch. 14 Feature Understanding and Selection
Descriptive Statistics
all features are listed in rows, with their names under the Feature Name header
the Index header gives the order in which features were read into DataRobot
the index number is used to specify which feature is being discussed
Unique: number of unique values found in the feature
any feature name can be clicked to display more details about the feature
Numeric Columns
Mean, Std Dev, Median, Min, and Max
Categorical Data
Mode, Std Dev, Median, Min, Max
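The summary statistics named above can be reproduced by hand to check what each column means. This is a minimal sketch using only Python's standard library, not DataRobot's own implementation. The function names and sample data are hypothetical, and the sample (n - 1) standard deviation is an assumption, since the notes do not say which variant DataRobot reports. For the categorical case the sketch covers only the mode and the unique count.

```python
import statistics

def summarize_numeric(values):
    """Return the five summary statistics listed for numeric columns."""
    return {
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # assumption: sample std dev (n - 1)
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

def summarize_categorical(values):
    """Return the unique count and the most frequent category (mode)."""
    return {
        "unique": len(set(values)),
        "mode": statistics.mode(values),
    }

# Hypothetical example data
ages = [23, 35, 35, 41, 58]
colors = ["red", "blue", "red", "green"]

numeric_summary = summarize_numeric(ages)
categorical_summary = summarize_categorical(colors)
print(numeric_summary)
print(categorical_summary)
```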
[] - inclusive
() - exclusive
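The bracket notation can be made concrete with a small membership check. This is an illustrative sketch only. The helper name `in_interval` and its `notation` parameter are hypothetical and not part of DataRobot.

```python
def in_interval(x, low, high, notation="[)"):
    """Check whether x lies in the interval from low to high.

    notation is two characters: '[' or '(' for the lower bound and
    ']' or ')' for the upper bound. Square brackets include the
    endpoint; parentheses exclude it.
    """
    if len(notation) != 2 or notation[0] not in "[(" or notation[1] not in "])":
        raise ValueError("notation must look like '[]', '[)', '(]' or '()'")
    lower_ok = x >= low if notation[0] == "[" else x > low
    upper_ok = x <= high if notation[1] == "]" else x < high
    return lower_ok and upper_ok

# [10, 20) contains 10 but not 20
print(in_interval(10, 10, 20, "[)"))
print(in_interval(20, 10, 20, "[)"))
```

With `"[]"` both endpoints count as inside the range, and with `"()"` neither does.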
Data Types
indicates the nature of the data inside of a feature
Categorical
2 categories = binary categorical
many categories = multi-class categorical
Numeric
Any type of number (integers and decimals)
(ensure categorical data is not interpreted as numeric)
Boolean
categorical feature that always holds one of two values: true or false
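The distinctions above can be sketched as a simple rule-based check. This is a rough illustration, not how DataRobot detects types. The function `infer_feature_type` and its `treat_as_categorical` flag are hypothetical. The flag stands in for the manual check that number-coded categories (for example zip codes) are not treated as numeric.

```python
def infer_feature_type(values, treat_as_categorical=False):
    """Guess a feature's type from its values (None means missing)."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"

    # bool is a subclass of int in Python, so test for booleans first
    if all(isinstance(v, bool) for v in present):
        return "boolean"

    is_numeric = all(isinstance(v, (int, float)) for v in present)
    if is_numeric and not treat_as_categorical:
        return "numeric"

    # Everything else is categorical; the category count decides which kind
    if len(set(present)) == 2:
        return "binary categorical"
    return "multi-class categorical"

print(infer_feature_type([True, False, True]))
print(infer_feature_type([1, 2.5, 3]))
print(infer_feature_type(["yes", "no", "yes"]))
# Zip codes are numbers on the surface but categories in meaning
print(infer_feature_type([80301, 80302, 80303], treat_as_categorical=True))
```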
Text type
Capable of capturing currency through currency symbols
Capable of capturing measurements based on <feet>'<inches>''
Extract numeric values for dates (days of week, month, year, etc)
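The three conversions above (currency, feet and inches, date parts) can be sketched with the standard library. These helpers are hypothetical and simplified. They assume a leading currency symbol with comma thousands separators, and a height written as feet, an apostrophe, inches, then either a double quote or two apostrophes. DataRobot's actual parsing rules may differ.

```python
import re
from datetime import date

def parse_currency(text):
    """Convert a string such as '$1,250.50' to a float, or None if it does not match."""
    match = re.fullmatch(r"\s*[$€£]\s*([\d,]+(?:\.\d+)?)\s*", text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))

def parse_feet_inches(text):
    """Convert a height such as 5'11\" (or 5'11'') to total inches, or None."""
    match = re.fullmatch(r"\s*(\d+)'\s*(\d+)(?:\"|'')\s*", text)
    if match is None:
        return None
    feet, inches = int(match.group(1)), int(match.group(2))
    return feet * 12 + inches

def date_parts(d):
    """Extract numeric features from a date; day_of_week is 1 (Monday) to 7 (Sunday)."""
    return {
        "year": d.year,
        "month": d.month,
        "day_of_month": d.day,
        "day_of_week": d.isoweekday(),
    }

print(parse_currency("$1,250.50"))
print(parse_feet_inches("5'11\""))
print(date_parts(date(2024, 3, 15)))
```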
Evaluations of Feature Content
Missing Values
avoid mislabeled data
avoid problems when running regressions, neural networks, and support vector machines
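Counting missing values per feature is a quick first check. The sketch below also shows one common way to handle them: median imputation plus a "was missing" flag. The imputation strategy is an assumption added for illustration, as the notes do not say how missing values are treated. The function names and the `income` data are hypothetical, and `None` is taken to mean "missing".

```python
import statistics

def missing_count(values):
    """Count entries that are None (treated here as missing)."""
    return sum(1 for v in values if v is None)

def impute_median(values):
    """Fill None with the median of the observed values.

    Returns the filled list and a parallel list of 0/1 flags marking
    which rows were originally missing.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a feature with no observed values")
    fill = statistics.median(observed)
    imputed = [fill if v is None else v for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return imputed, was_missing

# Hypothetical example data
income = [52000, None, 61000, 48000, None]
n_missing = missing_count(income)
filled, flags = impute_median(income)
print(n_missing)
print(filled)
print(flags)
```

Keeping the flag column preserves the information that a value was absent, which a plain fill would otherwise hide.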
James Frainey
Jafr4672@colorado.edu