Chapter 6 Similarity, Neighbors, and Clusters
Similarity
Business tasks involve similarity
retrieve similar things directly
classification and regression
group similar items together into clusters
provide recommendations for similar products and similar people
medicine and law use similar cases to make decisions
Compare Similarity
Euclidean distance
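A minimal sketch of Euclidean distance between two feature vectors. The attribute names and values (age, years at address) are hypothetical and serve only to illustrate the calculation.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical instances described by (age, years at address).
person_a = (23, 2)
person_b = (26, 6)

# Differences are 3 and 4, so the distance is sqrt(3^2 + 4^2).
distance_ab = euclidean_distance(person_a, person_b)
print(distance_ab)
```

A smaller distance means the two instances are more similar. Attributes on very different scales would normally be rescaled first so that no single attribute dominates.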
Neighbors
Nearest neighbors
Functions
Case: analyzing the features of whiskies
Prediction
Classification: classify new instance in a simple setting
Probability Estimation
Regression
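The three prediction uses above can be sketched with one nearest-neighbor lookup: a majority vote for classification, the share of neighbors in a class for probability estimation, and the mean neighbor target for regression. The data, the function names, and the choice of k = 3 are assumptions made for illustration.

```python
import math
from collections import Counter

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, examples, k):
    """Return the k (features, target) pairs closest to the query."""
    return sorted(examples, key=lambda ex: euclidean_distance(query, ex[0]))[:k]

def classify(query, examples, k):
    """Classification: majority vote among the k nearest neighbors."""
    votes = Counter(label for _, label in nearest_neighbors(query, examples, k))
    return votes.most_common(1)[0][0]

def class_probability(query, examples, k, label):
    """Probability estimation: fraction of neighbors carrying the label."""
    neighbors = nearest_neighbors(query, examples, k)
    return sum(1 for _, y in neighbors if y == label) / len(neighbors)

def regress(query, examples, k):
    """Regression: average of the neighbors' numeric targets."""
    neighbors = nearest_neighbors(query, examples, k)
    return sum(y for _, y in neighbors) / len(neighbors)

# Hypothetical training data: the same points with a class label or a numeric target.
labeled = [((1, 1), "yes"), ((1, 2), "yes"), ((2, 1), "no"),
           ((8, 8), "no"), ((9, 8), "no")]
numeric = [((1, 1), 10.0), ((1, 2), 20.0), ((2, 1), 30.0),
           ((8, 8), 100.0), ((9, 8), 110.0)]

query = (1.5, 1.2)
predicted_label = classify(query, labeled, k=3)
p_yes = class_probability(query, labeled, k=3, label="yes")
predicted_value = regress(query, numeric, k=3)
```

This sketch weights all k neighbors equally. Weighting each neighbor by its distance is a common variation.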
Issues
Intelligibility: justifying a specific decision and understanding the model as a whole
Having too many irrelevant attributes
Solution: feature selection; tune the similarity/distance function manually
Computational cost of classifying a new instance is very high
Clusters
Find groups of objects in a dataset, based on a notion of similarity
Case: Whiskey Analytics, an exercise in exploratory data analysis
Hierarchical Clustering: grouping points by similarity
Highest level: single cluster contains everything
Lowest level: remove all circles; each point is its own cluster
Dendrogram: shows the hierarchy of the clusters explicitly
Distance function between clusters: Linkage function
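A rough sketch of bottom-up (agglomerative) hierarchical clustering, showing where the linkage function fits in. Single and complete linkage are two common choices, used here as examples. The points and function names are hypothetical. The recorded merge distances correspond to the heights at which branches join in a dendrogram.

```python
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(c1, c2):
    """Cluster distance = distance between the closest pair of members."""
    return min(euclidean_distance(p, q) for p in c1 for q in c2)

def complete_linkage(c1, c2):
    """Cluster distance = distance between the farthest pair of members."""
    return max(euclidean_distance(p, q) for p in c1 for q in c2)

def agglomerate(points, linkage):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [[p] for p in points]  # lowest level: every point is a cluster
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        height = linkage(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges  # the last merge yields the single top-level cluster

# Hypothetical 2-D points.
points = [(0, 0), (0, 1), (4, 0), (4, 2), (10, 0)]
single_heights = [h for _, _, h in agglomerate(points, single_linkage)]
complete_heights = [h for _, _, h in agglomerate(points, complete_linkage)]
```

Changing the linkage function changes the merge distances and can change which clusters get merged, so the resulting hierarchy depends on that choice.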
centroid-based clustering
Group of news stories released by a particular company
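One common centroid-based method is k-means, sketched below as an assumption about what this branch refers to. Each cluster is represented by the mean of its members, and the algorithm alternates between assigning points to the nearest centroid and recomputing the centroids. The points and starting centroids are hypothetical.

```python
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [centroid(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, clusters = k_means(points, centroids=[(1, 1), (9, 8)])
```

The starting centroids are fixed here to keep the run repeatable. In practice they are often chosen at random, and different starts can lead to different clusterings.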