Similarity, Neighbors, and Clusters
Fundamental concepts
Calculating similarity of objects described by data; Using similarity for prediction; Clustering as similarity-based segmentation.
Exemplary techniques
Searching for similar entities; Nearest neighbor methods; Clustering methods; Distance metrics for calculating similarity
Data mining procedures often are based on grouping things by similarity or searching for the “right” sort of similarity.
Following the basic procedure introduced above, the nearest neighbors are retrieved and their known target variables (classes) are consulted.
The goal is to predict whether a new customer will respond to a credit card offer based on how other, similar customers have responded.
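As a rough illustration of this retrieve-and-consult procedure, here is a minimal sketch of nearest-neighbor classification by majority vote. The customer features (age, number of cards held), the labels, and the names `euclidean` and `knn_predict` are all made up for the example; Euclidean distance and a simple majority vote are assumptions, not the only possible choices.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Predict a class for `query` by majority vote of its k nearest neighbors.

    `train` is a list of (features, label) pairs with known labels.
    """
    # Retrieve the k stored instances closest to the query.
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    # Consult their known target values and take the most common one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical customers: (age, number of cards held) -> responded to the offer?
customers = [
    ((35, 3), "yes"),
    ((38, 2), "yes"),
    ((51, 1), "no"),
    ((55, 4), "no"),
    ((33, 1), "yes"),
]

new_customer = (37, 2)
prediction = knn_predict(customers, new_customer, k=3)
print(prediction)
```

With this toy data the three closest customers to the new one all responded, so the vote comes out as "yes".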
One benefit of nearest-neighbor methods is that training is very fast because it usually
involves only storing the instances.
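To make the "training is just storing" point concrete, the sketch below uses a hypothetical class, `NearestNeighborClassifier`, whose `fit` step only memorizes the instances. All distance computation is deferred to `predict`. It uses a single nearest neighbor and squared Euclidean distance for brevity; both are assumptions made for the illustration.

```python
class NearestNeighborClassifier:
    """Illustrative 1-nearest-neighbor classifier (hypothetical, not a library API)."""

    def fit(self, instances, labels):
        # "Training" does no computation: it only stores the labeled instances.
        self.instances = list(instances)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # The real work happens here, at prediction time:
        # compute the squared distance from the query to every stored instance.
        distances = [
            sum((x - q) ** 2 for x, q in zip(instance, query))
            for instance in self.instances
        ]
        # Return the label of the closest stored instance.
        best = min(range(len(distances)), key=distances.__getitem__)
        return self.labels[best]


model = NearestNeighborClassifier().fit([(1.0, 1.0), (5.0, 5.0)], ["a", "b"])
result = model.predict((4.0, 4.5))
print(result)
```

The flip side of cheap training is that every prediction has to scan the stored instances, so the cost moves to query time.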
Before concluding a discussion of nearest-neighbor methods as predictive models, we
should mention several issues regarding their use.
Nearest-neighbor methods typically take into account all features when calculating the
distance between two instances.
Numeric attributes may have vastly different ranges, and unless they are scaled appropriately, the effect of one attribute with a wide range
can swamp the effect of another with a much smaller range.
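The swamping effect can be seen in a small sketch. The data below (age in years, income in dollars) is invented for illustration, and min-max scaling to [0, 1] is just one common way to put attributes on a comparable range; the helper names `min_max_scale` and `nearest_index` are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_scale(rows):
    """Rescale every column to the range [0, 1] (constant columns become 0.0)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple(
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]


def nearest_index(rows, query_index):
    """Index of the row closest to rows[query_index], excluding the row itself."""
    query = rows[query_index]
    others = [i for i in range(len(rows)) if i != query_index]
    return min(others, key=lambda i: euclidean(rows[i], query))


# Hypothetical customers: (age in years, income in dollars).
rows = [
    (25, 50_000),  # the query customer
    (60, 51_000),  # very different age, almost the same income
    (27, 54_000),  # similar age, somewhat different income
    (45, 90_000),
]

raw_nearest = nearest_index(rows, 0)
scaled_nearest = nearest_index(min_max_scale(rows), 0)
print(raw_nearest, scaled_nearest)
```

On the raw values, income differences in the thousands dominate, so the 60-year-old is judged closest to the 25-year-old. After scaling, age contributes on equal footing and the 27-year-old becomes the nearest neighbor.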