Chapter 3: Predictive Modeling
identifying informative attributes
Typically something we do not want to occur
Information reduces uncertainty
Entropy: an impurity measure that quantifies the disorder in a dataset
you want to reduce entropy
For a two-class target, a perfectly even split between the classes gives the dataset an entropy of 1; a pure dataset has an entropy of 0.
Information gain: how much splitting the dataset on an attribute reduces entropy
For numeric target variables, variance measures impurity
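The entropy notes above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the toy "yes"/"no" labels are assumptions made for the example, and entropy is taken in base 2 so that an even two-class split scores 1.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child segments."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

def variance(values):
    """Population variance, used as the impurity measure for a numeric target."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Toy data (hypothetical): 4 "yes" and 4 "no" target values
even = ["yes"] * 4 + ["no"] * 4
pure = ["yes"] * 8

# A split that separates the classes perfectly, and one that does not help at all
perfect_split = [["yes"] * 4, ["no"] * 4]
useless_split = [["yes", "yes", "no", "no"], ["yes", "yes", "no", "no"]]

print(entropy(even))                           # evenly mixed two-class data
print(entropy(pure))                           # pure data
print(information_gain(even, perfect_split))   # all uncertainty removed
print(information_gain(even, useless_split))   # no uncertainty removed
```

The perfect split recovers the full entropy of the parent as information gain, while the useless split leaves both segments as mixed as the parent and gains nothing.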
Segmenting data by progressive attribute selection
Tree structured models
Multiple attribute selection
each leaf contains a value for the target variable
each leaf contains a segment classification
Leaves should be homogeneous
Attributes / target variable
Trees can also create a set of rules. If/then statements
Leaves can give a probability rather than a definitive yes/no
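One way to picture a tree as if/then rules with probabilities at the leaves is the sketch below. Everything in it is hypothetical and chosen only for illustration: the attribute names (`house`, `overage`), the threshold of 100, and the target values stored in each leaf. The probability at a leaf is assumed to be the fraction of training instances in that segment that belong to the positive class.

```python
def leaf_probability(labels, positive="yes"):
    """Fraction of instances in a leaf that belong to the positive class."""
    return sum(1 for label in labels if label == positive) / len(labels)

# Hypothetical leaves: the target values of the training instances in each segment
LEAVES = {
    "owns_house": ["no", "no", "no", "yes"],
    "rents_high_overage": ["yes", "yes", "yes", "no", "yes"],
    "rents_low_overage": ["no", "yes"],
}

def predict_probability(record):
    """Follow the tree as nested if/then rules and return the leaf's probability."""
    if record["house"] == "own":
        leaf = "owns_house"
    elif record["overage"] > 100:   # hypothetical threshold
        leaf = "rents_high_overage"
    else:
        leaf = "rents_low_overage"
    return leaf_probability(LEAVES[leaf])

def predict_class(record, threshold=0.5):
    """Turn the probability into a yes/no answer using a cutoff."""
    return "yes" if predict_probability(record) > threshold else "no"

owner = {"house": "own", "overage": 0}
heavy_renter = {"house": "rent", "overage": 250}

print(predict_probability(owner))
print(predict_probability(heavy_renter))
print(predict_class(heavy_renter))
```

Returning the probability keeps more information than the yes/no answer: two leaves that both predict "yes" can still differ a lot in how homogeneous they are.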
quality of the variables individually
Douglas Beighle
Model: a simplified version of reality
Predictive model: a formula for estimating the target variable
Classification models
regression models
Supervised learning: model creation finds a relationship between a set of variables and a predefined variable, the "target variable".
The fundamental concept: how do we know if a variable contains important information about the target variable?
Highest information gain feature (HOUSE)
root of the tree
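Choosing the root can be sketched as: compute the information gain of every attribute and take the highest. The rows below are made-up toy data, and the attribute names (`house`, `income`) and target name (`churn`) are assumptions for this example only; the helper functions are likewise illustrative.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())

def gain_for_attribute(rows, attribute, target):
    """Information gain from segmenting the rows by one attribute's values."""
    segments = defaultdict(list)
    for row in rows:
        segments[row[attribute]].append(row[target])
    parent = [row[target] for row in rows]
    weighted = sum(len(seg) / len(rows) * entropy(seg)
                   for seg in segments.values())
    return entropy(parent) - weighted

# Hypothetical toy dataset
rows = [
    {"house": "own",  "income": "high", "churn": "no"},
    {"house": "own",  "income": "low",  "churn": "no"},
    {"house": "own",  "income": "high", "churn": "no"},
    {"house": "rent", "income": "low",  "churn": "yes"},
    {"house": "rent", "income": "high", "churn": "yes"},
    {"house": "rent", "income": "low",  "churn": "yes"},
]

gains = {attr: gain_for_attribute(rows, attr, "churn")
         for attr in ("house", "income")}
root = max(gains, key=gains.get)  # attribute with the highest information gain

print(gains)
print(root)
```

In this toy data `house` separates the target values perfectly, so it has the highest gain and becomes the root; the same selection is then repeated inside each resulting segment to grow the rest of the tree.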