Introduction to Predictive Modeling: From Correlation to Supervised…

- - - - Sometimes called feature vector
  - - - Induction algorithm, procedure that creates the model
        
        Training data: input data for the induction algorithm
- - - - Ex: middle-aged professionals who reside in NYC
    - - Rank variables by how well they predict the value of the target
      - Selecting Informative Attributes
        
        Binary stick person example
        
        Want resulting groups to be pure (homogeneous) with respect to the target variable
        
        If even one member differs, the entire group is impure
        
        Rarely find pure data
        
        Make it as pure as possible
        
        Attributes rarely split a group perfectly
        
        Not all attributes binary
        
        Many have 3+ distinct values
        
        Some attributes take on numeric values
        
        Formula evaluates how well attribute splits segments
        
        Formula based on purity measure
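
The notes do not spell the formula out at this point. A sketch of the usual definitions, assuming the purity measure is entropy and the formula is information gain (IG), as the attribute-selection notes suggest. Here p_i is the proportion of class i within a segment and p(c_j) is the proportion of the parent's instances that fall into child segment c_j; the symbol names are chosen for illustration.

```latex
\[
\text{entropy} = -\sum_{i} p_i \log_2 p_i
\]
\[
IG(\text{parent}, \text{children}) = \text{entropy}(\text{parent}) - \sum_{j} p(c_j)\,\text{entropy}(c_j)
\]
```

A pure segment has entropy 0; a two-class segment split 50/50 has entropy 1, the maximum for two classes.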
        
      - Attribute Selection w/ IG
        
        Rank by IG to simplify
        
        Mushroom Example
        
        Target Var. - edible
        
        Values - yes (edible) no (poisonous)
        
        1st calculate entropy
        
        Entropy of the entire dataset: 0.96
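
Ranking attributes by IG can be sketched as below. This is a minimal illustration, not the course's implementation: the rows are a hypothetical toy sample (not the real mushroom dataset), and the attribute names `odor` and `cap` and the helper names are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Parent entropy minus the size-weighted entropy of the child segments."""
    parent = entropy([row[target] for row in rows])
    children = {}
    for row in rows:
        # Group target labels by the value this row has for the attribute.
        children.setdefault(row[attribute], []).append(row[target])
    weighted = sum(len(c) / len(rows) * entropy(c) for c in children.values())
    return parent - weighted


# Hypothetical toy rows, NOT taken from the real mushroom data.
rows = [
    {"odor": "none", "cap": "flat", "edible": "yes"},
    {"odor": "none", "cap": "bell", "edible": "yes"},
    {"odor": "foul", "cap": "flat", "edible": "no"},
    {"odor": "foul", "cap": "bell", "edible": "no"},
]

# Rank attributes from most to least informative about the target.
ranking = sorted(
    ["odor", "cap"],
    key=lambda a: information_gain(rows, a, "edible"),
    reverse=True,
)
print(ranking)
```

In this toy sample `odor` splits the rows into pure groups while `cap` leaves each group as mixed as the parent, so `odor` ranks first.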
  - - - Segments of data take form of a tree
        
        Classification trees used as predictive models
        
        Nonleaf nodes are referred to as "decision nodes"
        
        Provides a model that represents the sort of supervised segmentation we want
        
        Divide-and-conquer approach
        
        Visualizing Segmentations
        
        Only possible to visualize 2-3 dimensions
    - - Trace down single path from root node to leaf collecting conditions as we go
      - Consists of attribute tests along the path connected with AND
        
        Ex: IF (Balance < 50k) AND (Age < 50) THEN Class=Write-off
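
Tracing each root-to-leaf path and joining its tests with AND can be sketched as follows. The nested-dict tree layout, the `extract_rules` helper, and the "No Write-off" leaves are hypothetical, added only to complete the example; only the Balance/Age rule comes from the notes.

```python
# Hypothetical tree: decision nodes hold a "test" plus "yes"/"no" branches,
# leaves hold a class "label".
tree = {
    "test": "Balance < 50k",
    "yes": {
        "test": "Age < 50",
        "yes": {"label": "Write-off"},
        "no": {"label": "No Write-off"},
    },
    "no": {"label": "No Write-off"},
}


def extract_rules(node, conditions=()):
    """Walk every root-to-leaf path, collecting the tests met along the way."""
    if "label" in node:
        # Leaf reached: the collected conditions form the rule's antecedent.
        antecedent = " AND ".join(f"({c})" for c in conditions) or "TRUE"
        return [f"IF {antecedent} THEN Class={node['label']}"]
    test = node["test"]
    return (
        extract_rules(node["yes"], conditions + (test,))
        + extract_rules(node["no"], conditions + (f"NOT {test}",))
    )


rules = extract_rules(tree)
for rule in rules:
    print(rule)
```

Each leaf yields exactly one rule, so the tree and the rule set describe the same segmentation.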
  - - - Can use in a more sophisticated decision-making process
        
        Want each segment (leaf of tree) an assigned probability
        
        Frequency-based estimate
        
        Could lead to overfitting
    - - p(c) = (n + 1) / (n + m + 2)
        
        n = number of examples in the leaf belonging to class c, m = number of examples in the leaf not belonging to class c
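
A small sketch comparing the plain frequency-based estimate with the smoothed formula above, to show why the correction helps with tiny leaves. The function names are chosen for illustration; n and m are the counts of examples in the leaf that do and do not belong to class c.

```python
def frequency_estimate(n, m):
    """Plain frequency-based estimate: share of the leaf's examples in class c."""
    return n / (n + m)


def laplace_estimate(n, m):
    """Smoothed estimate p(c) = (n + 1) / (n + m + 2)."""
    return (n + 1) / (n + m + 2)


# A leaf with only 2 examples, both in class c.
print(frequency_estimate(2, 0), laplace_estimate(2, 0))

# A leaf with 20 examples, all in class c.
print(frequency_estimate(20, 0), laplace_estimate(20, 0))
```

Both leaves get a frequency-based estimate of 1.0, but the smoothed estimate is 0.75 for the 2-example leaf and 21/22 for the 20-example leaf, so the leaf with less evidence is pulled further toward 0.5.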