Linear Regression, Classification, Text Data (Unstructured Data), ML…

- - - - Error gives confidence band
        
        Which gives CI for predicted value of Y
- - - - Identify redundant variables
  - - - Use weighted least-squares criteria or a scatter plot to identify
  - - - Add variables.
      - Add nonlinear variables such as log or product terms
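
Adding nonlinear variables can be illustrated with a small sketch. The function name `add_nonlinear_features` and the two-column input are assumptions made for this example only; the sketch also assumes the first variable is strictly positive so that its log is defined.

```python
import math

def add_nonlinear_features(rows):
    """Append log(x1) and the product x1 * x2 to each [x1, x2] row.

    Assumes x1 > 0 so that the logarithm is defined.
    """
    expanded = []
    for x1, x2 in rows:
        expanded.append([x1, x2, math.log(x1), x1 * x2])
    return expanded

# Hypothetical toy data: two original variables per observation.
rows = [[1.0, 2.0], [2.0, 3.0], [4.0, 0.5]]
features = add_nonlinear_features(rows)
```

The model stays linear in its coefficients; only the inputs are transformed, so ordinary linear regression can still be fitted on the expanded rows.
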
- - - - regression
        
        Linear regression
      - classification
  - - - Continuous
        
        Regression
      - Categorical
        
        Classification
      - Independent
      - Dependent
        
        "Labeled data"- already known for training
      - Theta
        
        Slope/Location of line
- - - - Lose valuable data for training
    - - Cross-validation (5-10 folds)
    - - Assessing parameter estimates
    - - Resampling
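
As a rough sketch of how k-fold cross-validation avoids permanently setting aside data, the following splits sample indices into contiguous folds so that every sample is used for validation exactly once and for training in the other rounds. The helper name `k_fold_indices` is hypothetical, and real workflows usually shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Example: 10 samples, 5 folds (each fold holds out 2 samples).
folds = list(k_fold_indices(n_samples=10, k=5))
```

Averaging a model's score over the folds gives an estimate of its performance while still letting every observation contribute to training.
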
- - - - Each word is represented as a vector (an array of 1s and 0s). The length of the vector is the size of the dictionary.
      - Helpful for information retrieval
      - Issues: synonyms are not accounted for, vectors are very long, word importance is not considered, and there is no relationship between words (word order is ignored, so meaning is lost)
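
A minimal sketch of this 1/0 representation, assuming a tiny three-word dictionary chosen purely for illustration (`one_hot` and the sample words are not from the original notes):

```python
def one_hot(word, vocabulary):
    """Return a 0/1 vector with a single 1 at the word's dictionary position."""
    vec = [0] * len(vocabulary)
    vec[vocabulary.index(word)] = 1
    return vec

# Hypothetical dictionary; real dictionaries have many thousands of entries.
vocabulary = sorted({"cat", "dog", "fish"})
cat_vec = one_hot("cat", vocabulary)

# A document becomes a 0/1 vector marking which dictionary words occur in it.
doc = "dog cat dog".split()
doc_vec = [1 if w in doc else 0 for w in vocabulary]
```

Here "dog cat dog" and "cat dog" map to the same vector, which shows how word order and repetition are lost.
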
  - - - How often a word appears in a document
        vs. how often it appears in
        all documents (lower count = higher importance)
        
        Same length of vector, but entries are fractions
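
This weighting is commonly known as TF-IDF. The sketch below assumes one simple variant, term frequency times log(N / document frequency); libraries often use smoothed or normalized versions, so exact numbers will differ. The function name `tf_idf` and the toy documents are illustrative only.

```python
import math

def tf_idf(docs):
    """Return (vocab, vectors) using tf * log(n_docs / df), unsmoothed."""
    vocab = sorted({w for d in docs for w in d})
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    vectors = []
    for d in docs:
        vectors.append(
            [(d.count(w) / len(d)) * math.log(n_docs / df[w]) for w in vocab]
        )
    return vocab, vectors

# Hypothetical tokenized documents.
docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
vocab, vectors = tf_idf(docs)
```

A word such as "the" that appears in every document gets weight 0, while a word found in only one document, such as "dog", gets the highest weight in that document.
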
- - - - Easy to explain?
      - Superior?
      - Both?
- - - - Single Category classification
      - Multiple classes
  - - - K-means
      - Gaussian mixture models
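
To make the K-means entry concrete, here is a minimal one-dimensional sketch of the usual assign-then-update loop (Lloyd's algorithm). The function name `kmeans_1d`, the fixed starting centers, and the toy points are assumptions for illustration; practical implementations handle multiple dimensions and choose starting centers more carefully.

```python
def kmeans_1d(points, centers, n_iter=10):
    """Run n_iter rounds of assignment and center updates on 1-D points."""
    centers = list(centers)
    for _ in range(n_iter):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to its cluster mean
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical data with two visible groups, near 1 and near 9.5.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
```

K-means assigns each point to exactly one cluster; a Gaussian mixture model instead gives each point a probability of belonging to each cluster.
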