Decision Tree Algorithm
Introduction
- A flowchart-like model that classifies an example by a sequence of attribute tests from the root down to a leaf
Types of Attributes
Numerical Attribute
- Attribute whose domain is numerical
Categorical Attribute
- Attribute whose domain is not numerical
Properties
Internal Node
- Represents a test on an attribute
Edge (Branch)
- Represents an outcome of the test
Leaf
- Represents a class label
Entropy
- Measures how informative a node is (its degree of impurity)
- p = number of positive examples, n = number of negative examples, t = p + n
Entropy(S) = -(p/t)·log2(p/t) - (n/t)·log2(n/t)
Entropy[14,0] = 0 (pure node)
Entropy[7,7] = 1 (uniform class distribution, maximum impurity)
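The two-class entropy formula above can be sketched in Python; this is a minimal version that applies the usual 0·log2(0) = 0 convention explicitly:

```python
import math

def entropy(p, n):
    """Entropy of a node with p positive and n negative examples."""
    t = p + n
    result = 0.0
    for count in (p, n):
        if count:  # treat 0 * log2(0) as 0
            frac = count / t
            result -= frac * math.log2(frac)
    return result

print(entropy(14, 0))  # 0.0  (pure node)
print(entropy(7, 7))   # 1.0  (uniform split)
```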
Selecting the Best Split
Compare the degree of impurity of the child nodes
- The smaller the degree of impurity, the more skewed the class distribution
Node with class distribution (0, 1) – zero impurity
Node with uniform class distribution (0.5, 0.5) – the highest impurity
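One way to compare candidate splits under this criterion is to weight each child's entropy by its share of the examples and prefer the split with the smaller weighted impurity. A sketch, using hypothetical child counts for a (9, 5) parent:

```python
import math

def entropy(p, n):
    t = p + n
    return -sum((c / t) * math.log2(c / t) for c in (p, n) if c)

def weighted_child_entropy(children):
    """Average the entropy of each child node, weighted by its size.
    `children` is a list of (p, n) counts, one pair per child."""
    total = sum(p + n for p, n in children)
    return sum(((p + n) / total) * entropy(p, n) for p, n in children)

# Two candidate splits of the same (9, 5) parent (hypothetical counts):
split_a = [(6, 1), (3, 4)]   # both children still mixed
split_b = [(7, 0), (2, 5)]   # one child is pure

print(weighted_child_entropy(split_a))
print(weighted_child_entropy(split_b))
# The split with the smaller weighted impurity is preferred.
```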
Impurity Measures
Entropy
- ID3, C4.5
Gini Index
- CART, Breiman
Gini(S) = 1 - (p/t)^2 - (n/t)^2
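The two measures can be computed side by side; a small sketch for two-class nodes (the Gini formula used here, 1 - (p/t)^2 - (n/t)^2, is the standard CART definition). Both are 0 for a pure node and maximal for a uniform class distribution:

```python
import math

def entropy(p, n):
    t = p + n
    return -sum((c / t) * math.log2(c / t) for c in (p, n) if c)

def gini(p, n):
    """Gini index for a two-class node: 1 - (p/t)^2 - (n/t)^2."""
    t = p + n
    return 1 - (p / t) ** 2 - (n / t) ** 2

for p, n in [(14, 0), (10, 4), (7, 7)]:
    print(f"({p},{n}): entropy={entropy(p, n):.3f}, gini={gini(p, n):.3f}")
```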
Problems
Overfitting
- Training errors are small but test errors are large
- Too many branches
- Some may reflect anomalies due to noise
- Poor accuracy for unseen samples
Underfitting
- Arises when both the training errors & test errors are large
- Occurs when the developed model is too simple to capture the underlying patterns
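The two failure modes can be illustrated with a small hand-made example (hypothetical data; the "models" are stand-ins for a fully grown tree, a one-leaf tree, and a one-split tree):

```python
def true_label(x):            # the real concept: positive iff x >= 5
    return x >= 5

train_x = list(range(10))
# Two training labels (x = 2 and x = 7) are flipped by noise.
train_y = [true_label(x) != (x in (2, 7)) for x in train_x]

memorised = dict(zip(train_x, train_y))

def overfit(x):               # memorises every example, noise included
    return memorised[x]

def underfit(x):              # ignores x entirely: a one-leaf "tree"
    return False

def simple_rule(x):           # a one-split tree matching the true concept
    return x >= 5

def error_rate(model, xs, ys):
    return sum(model(x) != y for x, y in zip(xs, ys)) / len(xs)

test_y = [true_label(x) for x in train_x]   # noise-free test labels
for name, m in [("overfit", overfit), ("underfit", underfit), ("simple", simple_rule)]:
    print(f"{name}: train={error_rate(m, train_x, train_y):.1f} "
          f"test={error_rate(m, train_x, test_y):.1f}")
```

The memorising model has zero training error but larger test error (overfitting); the one-leaf model has large errors on both sets (underfitting).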