MINING ALGORITHM - Coggle Diagram
MINING ALGORITHM
Clustering
What is it?
Clustering - an unsupervised learning technique that groups
data points and creates partitions based on similarity.
Where to use it?
- On numerical data: any categorical variable must first be converted to numeric form by binarization.
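As an illustration of the binarization step, here is a minimal sketch in plain Python. The function name `binarize` and the sample `colors` list are hypothetical, and the sketch assumes binarization means one-hot encoding (one 0/1 column per category), which is one common reading of the term.

```python
def binarize(values):
    """One-hot encode a list of categorical values.

    Returns (categories, rows): each row has a 1 in the column
    of its own category and 0 everywhere else.
    """
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]
categories, encoded = binarize(colors)
print(categories)
print(encoded)
```

After this step every data point is a numeric vector, so distance-based clustering can be applied to it.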
How to use it?
Using two approaches:
Hierarchical Approach
- Typical methods: DIANA, AGNES, BIRCH, CHAMELEON
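To make the hierarchical approach concrete, the sketch below shows a bottom-up (agglomerative) clustering in the spirit of AGNES. It is a simplified illustration, not the published AGNES algorithm: it assumes single-linkage distance, stops at a chosen number of clusters `k`, and the names `agglomerative`, `single_linkage`, and the sample `points` are hypothetical.

```python
import math


def single_linkage(a, b):
    """Distance between two clusters = distance of their closest pair of points."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerative(points, k):
    """Start with one cluster per point and merge the two closest
    clusters repeatedly until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical 2-D data: two tight pairs and one isolated point.
points = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.5, 5.2), (9.0, 1.0)]
clusters = agglomerative(points, 3)
print(clusters)
```

A divisive method such as DIANA works in the opposite direction: it starts with all points in one cluster and splits repeatedly.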
Decision Tree
How to use?
- Training samples start at the root node.
- The attributes must be categorical (all continuous attributes must be discretized in advance).
- Test attributes are selected using a heuristic (trial and error) or a statistical measure (e.g., information gain in ID3, gain ratio in C4.5).
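The attribute selection step can be sketched with information gain, the measure used by ID3. This is a minimal illustration under stated assumptions: rows are dictionaries of categorical attributes, and the names `entropy`, `information_gain`, `best_attribute`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy of the labels minus the weighted entropy after splitting on attr."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Pick the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy training set.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["no", "no", "yes", "yes"]
print(best_attribute(rows, labels, ["outlook", "windy"]))
```

In this toy set, splitting on `outlook` separates the classes perfectly, while splitting on `windy` leaves each branch as mixed as before, so `outlook` would be chosen as the test attribute.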
Things to consider:
The possible partitioning scenarios, which depend on the
attribute type (nominal, ordinal, or continuous) and the
way to split (2-way split, multi-way split, etc.)
What is it?
A flowchart-like tree structure where:
The topmost node - root node
The bottommost node - leaf node (terminal node)
A node that is divided into sub-nodes - parent node
A sub-node - child node
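The node terminology above can be illustrated with a small data structure. This is only a sketch: the `Node` class, the `classify` helper, and the example tree are hypothetical and assume categorical attributes with one branch per value.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # For a parent node: the attribute tested here.
    # For a leaf node: the predicted class.
    label: str
    # Maps an attribute value (branch) to a child node; empty for a leaf.
    children: dict = field(default_factory=dict)

    def is_leaf(self):
        return not self.children


def classify(node, sample):
    """Walk from the root node down to a leaf node, following the
    branch that matches the sample's value at each parent node."""
    while not node.is_leaf():
        node = node.children[sample[node.label]]
    return node.label


# Hypothetical tree: "outlook" is the root node; "windy" is a child of
# the root and also the parent of two leaf nodes.
root = Node("outlook", {
    "sunny": Node("windy", {"yes": Node("stay in"), "no": Node("play")}),
    "rain": Node("stay in"),
})
print(classify(root, {"outlook": "sunny", "windy": "no"}))
```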
Evaluation method
How to use it?
Accuracy
The accuracy of a classifier is the number of correct predictions divided by the total number of instances, expressed as a percentage.
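The accuracy definition can be written as a short function. This is a minimal sketch; the name `accuracy` and the sample label lists are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Hypothetical labels: 4 of the 5 predictions are correct.
y_true = ["yes", "no", "yes", "yes", "no"]
y_pred = ["yes", "no", "no", "yes", "no"]
print(accuracy(y_true, y_pred))
```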
When to use it?
As an integral part of many learning methods, where it helps find the model that best represents the training data.