Provost CH6: Similarity, Neighbors, and Clusters (Combining Functions:…
Introduction
Fundamental Concepts:
Calculating similarity of objects described by data; Using similarity for prediction; Clustering as similarity-based segmentation
Exemplary Techniques:
Searching for similar entities; Nearest neighbor methods; Clustering methods; Distance metrics for calculating similarity
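As a rough illustration of how similarity can be used for prediction, the sketch below classifies a new example by majority vote among its k nearest neighbors under Euclidean distance. The function name `knn_predict`, the toy data, and the choice of k = 3 are assumptions made for illustration and do not come from the chapter.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbors.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort training examples by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    # Keep the labels of the k closest examples and take the most common one.
    top_labels = [label for _, label in by_distance[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical toy data: (age, income in $1000s) -> responded to an offer?
train = [
    ((25, 40), "no"),
    ((30, 45), "no"),
    ((35, 50), "no"),
    ((45, 90), "yes"),
    ((50, 95), "yes"),
    ((55, 100), "yes"),
]

prediction = knn_predict(train, (48, 92), k=3)
print(prediction)
```

In practice the features would usually be rescaled first, since an attribute with a large numeric range (income here) otherwise dominates the distance.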
Geometric Interpretation, Overfitting, and Complexity Control
Clustering
Recall supervised segmentation: finding groups of objects that differ with respect to some target characteristic of interest
Clustering is unsupervised, with no target variable: we want to find groups of objects such that the objects within a group are similar to each other, while objects in different groups are not so similar
Hierarchical Clustering
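Consider “clipping” the dendrogram with a horizontal line and ignoring everything above the line: the branches that remain below the line define the clusters at that level, and moving the line lower yields more, smaller clusters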
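The clipping idea can be sketched in code. The example below is a minimal, unoptimized agglomerative clustering that assumes single linkage (cluster distance is the distance between the closest members) and Euclidean distance. The name `single_linkage_clusters`, the `cut_height` parameter, and the sample points are hypothetical. Stopping the merges once the closest pair of clusters is farther apart than `cut_height` gives the same clusters as cutting the full dendrogram at that height.

```python
import math


def single_linkage_clusters(points, cut_height):
    """Merge the closest clusters until the next merge would exceed `cut_height`."""
    clusters = [[p] for p in points]  # start with every point in its own cluster
    while len(clusters) > 1:
        best = None  # (distance, i, j) for the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cut_height:
            break  # every remaining merge lies above the clipping line
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical points: two tight pairs and one distant outlier.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
low_cut = single_linkage_clusters(points, cut_height=2)
high_cut = single_linkage_clusters(points, cut_height=10)
print(len(low_cut), len(high_cut))
```

With the lower cut the two pairs and the outlier remain separate clusters. With the higher cut the two pairs merge into one cluster and only the outlier stays apart.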
Summary
A common proxy for the similarity of two entities is the distance between them in the instance space defined by their feature vector representation: the smaller the distance, the more similar the entities
The same fundamental concept, similarity, also underlies the most common method for unsupervised data mining: clustering
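To make the distance proxy from the summary concrete, here is a small sketch that computes the Euclidean distance between two feature vectors. The helper `euclidean_distance` and the two example people (age, years at current address, residential status code) are illustrative assumptions.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    # Square the difference along each dimension, sum, then take the root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical feature vectors: (age, years at current address, residential status)
person_a = (23, 2, 2)
person_b = (40, 10, 1)

distance = euclidean_distance(person_a, person_b)
print(round(distance, 1))
```

The squared differences are 17² + 8² + 1² = 354, so the distance is about 18.8. The number means little on its own, but it is useful for comparing one pair of entities against another.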