Mindmap Chapter 3
Predictive Modeling
Supervised segmentation
segment population into groups
w/respect to one quality of interest
Supervised: need a target
How do we know if an attribute contains important information?
Want results to be "pure"
As homogeneous as possible
Finding attributes of entities described in data
Informative
Reduces uncertainty
Called "Model Induction"
Tree-Induction
Nodes
Interior Nodes
Test of an attribute
Terminal Nodes
Terminates at a leaf
Branches
Values or range of values of an attribute
Classification or decision tree
Probability
Could be regression tree
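
A minimal sketch of the tree structure described above: interior nodes test one attribute, branches carry that attribute's values, and terminal nodes (leaves) hold the prediction. The class names, the attributes ("employed", "balance") and the labels are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    # Terminal node: holds the predicted class (hypothetical labels).
    prediction: str


@dataclass
class Interior:
    # Interior node: tests a single attribute.
    attribute: str
    # Branches: one per attribute value, leading to a subtree or a leaf.
    branches: Dict[str, Union["Interior", Leaf]] = field(default_factory=dict)


def classify(node, instance):
    """Follow the branches matching the instance's attribute values to a leaf."""
    while isinstance(node, Interior):
        node = node.branches[instance[node.attribute]]
    return node.prediction


# Hypothetical tree: attribute names and labels are assumptions.
tree = Interior("employed", {
    "yes": Leaf("no write-off"),
    "no": Interior("balance", {
        "low": Leaf("no write-off"),
        "high": Leaf("write-off"),
    }),
})

print(classify(tree, {"employed": "no", "balance": "high"}))
```

Each instance is passed as a dict of attribute values (its feature vector); classification walks from the root to a leaf.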
Prediction: Estimate an unknown value
Not like descriptive modeling
Instance: described by a set of attributes
Also called a feature vector
Values of all attributes stated in data
Dataset: contains instances
Target: dependent variables
Features: independent variables
Input data: training/labeled
Attributes rarely split a group perfectly
Not all are binary
Some attributes take on numeric values
Purity measures
Information gain/entropy
Parent and child sets of data
Measures disorder of a set
Equation
Information gain (IG) can be computed for each candidate split of a set of data
For a numeric target (regression): impurity = variance
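
A minimal sketch of the purity measures above, assuming the standard definitions: entropy = -sum(p_i * log2(p_i)) over the class proportions p_i of a set, and information gain = entropy(parent) minus the size-weighted average entropy of the children. The function names and the toy labels are hypothetical, not taken from the notes.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy in bits of a set of class labels (0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return sum(-(count / total) * math.log2(count / total)
               for count in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child sets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


def variance(values):
    """Population variance, used as impurity when the target is numeric."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


# Toy example (made-up labels): a split that separates the classes perfectly.
parent = ["yes", "yes", "no", "no"]
children = [["yes", "yes"], ["no", "no"]]
print(entropy(parent))                     # maximally mixed two-class set
print(information_gain(parent, children))  # gain from the perfect split
```

A split with higher information gain leaves purer child sets; with a numeric target the same comparison can be made with variance in place of entropy.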