CH 3 (probability estimation (frequency-based estimate (use instance…

- - - - creates the model from the data
      - input data = training/labeled data
        
        value of target variable is known
  - - - creates model from general rules and specific facts
        
        creating other specific facts
  - - - a less accurate model may be preferred if it is easier to understand
- - - - not every attribute is binary
      - attributes very rarely split a group perfectly
      - some attributes take on numeric values
        
        formula that evaluates how well each attribute has split
        
        with respect to target variable
        
        based on purity measures
    - - information gain
        
        based on purity measure = entropy
        
        Information gain measures the
        improvement (decrease) in
        entropy produced by a segmentation
        
        measure of disorder
        
        Disorder corresponds to how impure
        the segment is w/ respect to the target variable
        
        A split need not yield perfectly pure
        segments: two large, relatively pure
        subsets can still be a good split
        
        how informative attributes are about the target
        
        numeric variable
        
        categorize the numbers
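
The entropy and information gain notes above can be sketched in code. This is a minimal illustration, not a reference implementation: the function names (`entropy`, `information_gain`, `split_numeric`) and the toy labels are hypothetical, and entropy is assumed to be Shannon entropy in bits. A numeric attribute is handled here by a single threshold, which is one simple way to categorize the numbers.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of target-variable labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child segments."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


def split_numeric(values, labels, threshold):
    """Split labels into two segments by comparing a numeric attribute to a threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    return left, right


# Hypothetical toy data: a maximally impure parent and an imperfect split.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print(entropy(parent))                          # maximally impure two-class segment
print(information_gain(parent, [left, right]))  # positive, but less than 1 bit
```

Neither child segment here is pure, yet the split still reduces entropy, which is why an attribute that does not separate the classes perfectly can still be informative.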
  - - - Followed by branches which are
        distinct values of attribute
        
        leaves segment the data
      - An instance with unknown classification can be
        classified by finding the segment (leaf) it falls into
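
As a rough sketch of how a tree assigns an instance to a segment: the nested-dictionary layout, the `classify` helper, and the attribute names below are all hypothetical choices made for illustration. Each inner node names an attribute, each branch is one distinct value of that attribute, and each leaf holds the prediction for its segment.

```python
def classify(node, instance):
    """Follow the branch matching each attribute value until a leaf is reached."""
    while isinstance(node, dict):
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node  # a leaf: the predicted class for this segment


# Hypothetical tree: inner nodes are dicts, leaves are class labels.
tree = {
    "attribute": "employed",
    "branches": {
        "no": "default",
        "yes": {
            "attribute": "balance",
            "branches": {"low": "default", "high": "no default"},
        },
    },
}

print(classify(tree, {"employed": "yes", "balance": "high"}))
```

An attribute value with no matching branch raises a `KeyError` in this sketch; a real tree would need a policy for unseen values.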
- - - - hyperplane equals separating surface
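
One way to picture a hyperplane as a separating surface is to check which side of it a point falls on. The sketch below assumes the hyperplane is written as w · x + b = 0; the function name, weights, and points are hypothetical and chosen only for illustration.

```python
def side_of_hyperplane(weights, bias, x):
    """Return +1 or -1 depending on which side of w . x + b = 0 the point x lies."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Hypothetical 2-D example: the line x1 - x2 = 0 separates the plane.
weights, bias = [1.0, -1.0], 0.0
print(side_of_hyperplane(weights, bias, (2.0, 1.0)))
print(side_of_hyperplane(weights, bias, (1.0, 2.0)))
```

Points lying exactly on the surface are assigned to the positive side here, which is an arbitrary convention.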