Provost Chapter - Clustering
To group similar items/people into clusters
Via distance
Euclidean distance
Draw a right triangle, find the longest side via the Pythagorean theorem
By assigning a distance between points based on relevant attributes:
This allows us to find items similar to the original point (based on attribute values)
It is the most common distance metric, but about a dozen others are used in data mining
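The "right triangle" picture above can be sketched directly; this is a minimal stdlib-only example (the point values are made up for illustration):

```python
import math

def euclidean(a, b):
    # Sum the squared differences over each attribute, then take the square root
    # (the Pythagorean theorem generalized to any number of dimensions).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean((0, 0), (3, 4)))  # 5.0 — the hypotenuse of a 3-4-5 right triangle
```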
Neighbors
Uses a set number of nearest neighbors (NN)
The number is chosen by the data scientist (an odd number avoids voting ties, etc.)
Shorthand = k
This parameter sets the model's complexity
Has shortcomings when it comes to unique cases/fields
House loans: denying an applicant because they are similar to other families
EX: Used by Netflix and Amazon
Has no strict "learning" about any one customer/base
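Putting the pieces together, a k-NN classifier is just "find the k closest training points, take a majority vote." A minimal sketch with hypothetical loan data (the numbers are invented for illustration):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote of its k nearest training points.
    `train` is a list of (features, label) pairs."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Sort training examples by distance to the query, keep the k closest
    nearest = sorted(train, key=lambda fl: dist(fl[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy loan data: (income, debt) -> decision; k is odd to break ties
train = [((50, 10), "approve"), ((60, 5), "approve"),
         ((20, 40), "deny"), ((25, 35), "deny"), ((55, 8), "approve")]
print(knn_predict(train, (52, 9), k=3))  # "approve"
```

Note there is no training step at all: the model is the stored data, which is why the notes say it does no strict "learning" about any one customer.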
Edit distance
Used for text
Shows how alike two strings are
Great for typos, street addresses
Has regex-like correction options
Also used when the order of characters in the strings is important
Biology
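The standard way to compute edit distance is the Levenshtein dynamic program: count the fewest insertions, deletions, and substitutions needed to turn one string into another. A minimal sketch (the street-address example is invented):

```python
def edit_distance(s, t):
    # Levenshtein distance via a single rolling row of the DP table
    m, n = len(s), len(t)
    d = list(range(n + 1))
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            prev, d[j] = d[j], min(d[j] + 1,                    # deletion
                                   d[j - 1] + 1,                # insertion
                                   prev + (s[i-1] != t[j-1]))   # substitution
    return d[n]

print(edit_distance("123 Main St", "123 Mian St"))  # 2 — a transposed-letter typo
```

Low distances flag likely typos; the same idea (with domain-specific costs) underlies sequence alignment in biology.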
Neighbor formulas
Clustering
Dendrogram
The groups nest like the layers of a jawbreaker
The layers sit inside one another; deeper = narrower classification
An advantage of hierarchical clustering is that it allows the data analyst to see the landscape
Can form around nodes
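The jawbreaker picture corresponds to agglomerative clustering: start with every point as its own cluster and repeatedly merge the closest pair; the merge history is exactly what a dendrogram draws. A stdlib-only single-linkage sketch (the points are made up):

```python
import math

def single_linkage(points):
    """Greedy agglomerative clustering. Returns the list of merged groups,
    from the tightest (deepest dendrogram layer) to the root containing all points."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest point-to-point distance
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: min(dist(a, b)
                                      for a in clusters[ij[0]]
                                      for b in clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        merges.append(merged)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

merges = single_linkage([(0, 0), (0, 1), (5, 5), (5, 6)])
print(merges[-1])  # the final merge holds all points — the dendrogram's root
```

Cutting the merge sequence at different depths gives the analyst the "landscape" view: few broad clusters near the root, many narrow ones near the leaves.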
Using Supervised Learning to Generate Cluster Descriptions
Assign a name (label) to each cluster
Use binary (0 or 1) feature checks
A tree of such checks sorts items into the clusters
EX: iPhone word articles
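The idea is to relabel each item by its cluster membership (1 = in the cluster, 0 = not) and train a classifier whose rules then *describe* the cluster. A minimal sketch using a one-level "decision stump" instead of a full tree; the article data and feature names are invented for illustration:

```python
def best_stump(examples):
    """Find the single binary feature that best separates in-cluster (label 1)
    from out-of-cluster (label 0) examples — a one-check cluster description.
    `examples` is a list of (feature_dict, label) pairs."""
    features = examples[0][0].keys()
    def accuracy(f):
        # Predict "in cluster" exactly when feature f is 1; score the matches
        return sum(feats[f] == label for feats, label in examples) / len(examples)
    return max(features, key=accuracy)

# Hypothetical news articles: 1 = belongs to the cluster we want to describe
data = [({"iphone": 1, "sports": 0}, 1),
        ({"iphone": 1, "sports": 1}, 1),
        ({"iphone": 0, "sports": 1}, 0),
        ({"iphone": 0, "sports": 0}, 0)]
print(best_stump(data))  # "iphone" — the cluster is "articles mentioning iPhone"
```

A real pipeline would use a full decision tree over many word features, but the output is the same kind of human-readable description.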
Analysts typically wind up spending less time in the business understanding phase
... and more time in the evaluation stage and in defining the next cycle