Similarity, Neighbors and Clusters
Different business tasks focus on different things
retrieve similar things, e.g. competitors
classification and regression
clusters
recommendations
reasoning
Similarity and Distance
similarity between objects
instances with similar class labels
distance between objects
distance(A, B) = sqrt((xA - xB)^2 + (yA - yB)^2)
compute the distance of individual dimensions
Euclidean distance between two points, in two or more dimensions
methods for organizing the space of data instances
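The distance formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `euclidean_distance` and the sample points are assumptions made for the example.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two points with the same number of dimensions."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    # Compute the difference along each dimension, square it, sum, take the root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two dimensions: the classic 3-4-5 right triangle.
print(euclidean_distance((0, 0), (3, 4)))        # 5.0
# The same function works unchanged for more dimensions.
print(euclidean_distance((1, 2, 3), (1, 2, 3)))  # 0.0
```

The per-dimension differences are computed individually and then combined, which is why the same function extends from two attributes to any number of them.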
nearest-neighbor reasoning
used to find the companies most similar to our best corporate customers, or the online consumers most similar to our best retail customers
once found, make decisions
using similarity for predictive modeling
classification
probability estimation
regression
Predictive Modeling
Classification
Probability Estimation
Regression
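how many neighbors, and how much influence should each have?
odd values of k are convenient: they avoid ties in a two-class majority vote
the greater k is, the more the estimates are smoothed out
maximum is k = n (the number of training instances), where every prediction uses the whole dataset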
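The three nearest-neighbor uses listed above (classification, probability estimation, regression) and the effect of k can be sketched as follows. This is a toy illustration under stated assumptions: the helper names, the `(features, target)` pair layout, and the sample data are all hypothetical, and plain Euclidean distance stands in for whatever similarity measure a real application would choose.

```python
import math
from collections import Counter

def nearest_neighbors(query, examples, k):
    """Return the k (features, target) pairs closest to the query point."""
    return sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]

def knn_classify(query, examples, k):
    """Classification: majority vote among the k nearest neighbors."""
    votes = Counter(label for _, label in nearest_neighbors(query, examples, k))
    return votes.most_common(1)[0][0]

def knn_probability(query, examples, k, positive):
    """Probability estimation: share of neighbors carrying the positive label."""
    neighbors = nearest_neighbors(query, examples, k)
    return sum(1 for _, label in neighbors if label == positive) / len(neighbors)

def knn_regress(query, examples, k):
    """Regression: average target value of the k nearest neighbors."""
    neighbors = nearest_neighbors(query, examples, k)
    return sum(value for _, value in neighbors) / len(neighbors)

# Hypothetical labeled instances: (features, class label).
labeled = [((1.0, 1.0), "yes"), ((1.5, 2.0), "yes"),
           ((3.0, 4.0), "no"), ((3.5, 5.0), "no"), ((5.0, 7.0), "no")]
query = (2.0, 2.0)

print(knn_classify(query, labeled, k=3))            # yes
print(knn_probability(query, labeled, 3, "yes"))    # 2 of 3 neighbors are "yes"
print(knn_classify(query, labeled, k=5))            # no: k = n gives the overall majority

# Hypothetical numeric targets for regression.
valued = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
print(knn_regress((2.0,), valued, k=3))             # 20.0
print(knn_regress((2.0,), valued, k=4))             # 40.0: the far point now pulls the estimate
```

With k = 3 the prediction follows the local neighborhood; at k = n every query receives the same answer (the overall majority class or the overall mean), which is the extreme case of smoothing.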
Issues with Nearest-Neighbor Methods
intelligibility
reasoning about similar historical cases is a natural way of coming to a decision
Dimensionality and Domain Knowledge
numeric attributes may have vastly different ranges; unless scaled appropriately, the attribute with the largest range dominates the distance
computational efficiency
no effort is expended in creating a model
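The scaling issue noted above can be made concrete with a small sketch. Min-max scaling is used here as one common choice, not necessarily the one the notes had in mind; the function names and the (age, income) rows are hypothetical.

```python
import math

def min_max_scale(rows):
    """Rescale every column to the range [0, 1] (assumed scaling choice)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def nearest_index(target, rows):
    """Index of the row closest to rows[target], excluding the row itself."""
    others = [i for i in range(len(rows)) if i != target]
    return min(others, key=lambda i: math.dist(rows[target], rows[i]))

# Hypothetical customers: (age in years, income in dollars).
customers = [(25, 50000), (60, 51000), (27, 56000), (40, 90000)]

# Unscaled, income differences swamp age differences.
print(nearest_index(0, customers))                  # 1: the 60-year-old
# After scaling, both attributes contribute on a comparable footing.
print(nearest_index(0, min_max_scale(customers)))   # 2: the 27-year-old
```

Which neighbor counts as "nearest" changes once the attributes are on comparable scales, which is where domain knowledge about the relative importance of attributes comes in.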
Technical Details relating to similarities and neighbors
heterogeneous attributes
Manhattan distance (L1 norm)
Jaccard distance
cosine distance
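The three alternative distance measures named above can be sketched with the standard library alone. The function names and sample inputs are assumptions for illustration; Jaccard distance is shown on sets of items and cosine distance on numeric vectors, which is how each is typically applied.

```python
import math

def manhattan_distance(a, b):
    """L1 norm: sum of absolute differences along each dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))

def jaccard_distance(a, b):
    """One minus the ratio of shared items to all items across both sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets are treated as identical
    return 1 - len(a & b) / len(union)

def cosine_distance(a, b):
    """One minus the cosine of the angle between two vectors; ignores magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1 - dot / (norm_a * norm_b)

print(manhattan_distance((0, 0), (3, 4)))                   # 7
print(jaccard_distance({"a", "b", "c"}, {"b", "c", "d"}))   # 0.5
print(cosine_distance((1, 0), (0, 1)))                      # 1.0
```

Manhattan distance suits grid-like movement across attributes, Jaccard distance suits objects described by sets of characteristics, and cosine distance suits cases where direction matters more than magnitude, such as word counts in documents of different lengths.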
combining functions
majority vote classification
majority scoring function
similarity-moderated classification
similarity-moderated scoring
similarity-moderated regression
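The similarity-moderated combining functions above can be sketched by weighting each neighbor's contribution by its similarity to the query. The weight used here, the reciprocal of squared Euclidean distance plus a small constant `epsilon` to avoid division by zero, is an assumption for the example; the function names and data are hypothetical.

```python
import math
from collections import defaultdict

def weighted_neighbors(query, examples, k, epsilon=1e-9):
    """Return (target, weight) for the k nearest examples; closer means heavier."""
    ranked = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    return [(target, 1.0 / (math.dist(query, features) ** 2 + epsilon))
            for features, target in ranked]

def weighted_scores(query, examples, k):
    """Similarity-moderated scoring: each label's share of the total weight."""
    totals = defaultdict(float)
    for label, weight in weighted_neighbors(query, examples, k):
        totals[label] += weight
    overall = sum(totals.values())
    return {label: weight / overall for label, weight in totals.items()}

def weighted_classify(query, examples, k):
    """Similarity-moderated classification: label with the largest weight share."""
    scores = weighted_scores(query, examples, k)
    return max(scores, key=scores.get)

def weighted_regress(query, examples, k):
    """Similarity-moderated regression: weighted average of neighbor targets."""
    pairs = weighted_neighbors(query, examples, k)
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)

# One very close "yes" against two more distant "no" neighbors.
labeled = [((0.1, 0.0), "yes"), ((2.0, 0.0), "no"),
           ((0.0, 2.0), "no"), ((5.0, 5.0), "no")]
print(weighted_classify((0.0, 0.0), labeled, k=3))  # yes, although "no" has more votes

valued = [((1.0,), 10.0), ((2.0,), 20.0), ((4.0,), 40.0)]
print(weighted_regress((3.0,), valued, k=3))        # pulled toward the two closest points
```

A plain majority vote over the same three neighbors would answer "no"; weighting by similarity lets the single very close neighbor outweigh the two distant ones, and it also makes the exact choice of k less critical.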
Clustering
Hierarchical Clustering
groups points by their similarity
overlap when one cluster contains other clusters
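A rough sketch of how hierarchical clustering groups points by similarity, with larger clusters containing smaller ones. Single linkage (the distance between two clusters is the distance between their closest members) is an assumed choice for the example, as are the function name and the sample points.

```python
import math

def single_linkage(points):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance).

    Clusters are tuples of point indices. Each merge joins the two closest
    clusters, so later clusters contain the earlier ones.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: closest pair of members across the two clusters.
                d = min(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges

points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
for left, right, distance in single_linkage(points):
    print(left, "+", right, "at distance", round(distance, 2))
```

Read from first merge to last, the output lists the joins that a dendrogram would draw from bottom to top: the tight pairs merge first, then the pairs merge with each other, and the isolated point joins last.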
Clustering around centroids
similarities between the individual instances and how those similarities link them together
results
dendrogram
set of cluster centers plus corresponding data points for each cluster
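Clustering around centroids, and the "cluster centers plus corresponding data points" result described above, can be sketched with a minimal k-means loop. This is an illustration under assumptions: the starting centroids are passed in explicitly to keep the run deterministic (in practice they are usually chosen at random), and the function names and data are hypothetical.

```python
import math

def assign(points, centroids):
    """Group each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids to cluster means."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = assign(points, centroids)
        updated = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]   # keep an empty cluster's centroid in place
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break                          # assignments have stabilized
        centroids = updated
    return centroids, assign(points, centroids)

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
for center, members in zip(centers, clusters):
    print(center, members)
```

Unlike the hierarchical approach, the number of clusters is fixed up front, and the result is a flat partition: one center per cluster together with the points assigned to it.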