Intro To Machine Learning Workflow

Big picture

data - dataset knowledge

model - choosing the correct ML model/algorithm for the problem

problem - setting constraints, goals, and boundaries

Framing the problem

do we need ML

who are the key stakeholders

what data can I use

which metrics are important

who is going to be impacted

choosing metrics

should be easy to understand

adapted to the specific problem

business metrics != ML model metrics

confusion matrix

2x2 grid that displays the four prediction outcomes:

TP (true positives)

TN (true negatives)

FP (false positives)

FN (false negatives)

classification metrics

Precision

TP/(TP + FP)

of the samples predicted as positive, the fraction that are actually positive

Recall

TP/(TP + FN)

of the samples that are actually positive, the fraction the model correctly identifies

Accuracy

(TP + TN) / (TP + FN + FP + TN)

overall fraction of predictions the algorithm gets right
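
A minimal sketch (not from the notes) computing the three metrics above from raw confusion-matrix counts; the TP/TN/FP/FN values are made-up examples.

    # Hypothetical confusion-matrix counts.
    tp, tn, fp, fn = 90, 50, 10, 30

    precision = tp / (tp + fp)                  # of predicted positives, how many are correct
    recall = tp / (tp + fn)                     # of actual positives, how many were found
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # overall fraction of correct predictions

    print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")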

Object detection

green boxes are ground truths

red dashed boxes are the model's predictions

Of the elements classified as a particular class, how many did we get right? (precision)

The number of correctly classified images divided by the total number of images (accuracy).

Accuracy only applies to classification.

Object detection metrics build on the precision and recall defined above.
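
For detection, predicted boxes are usually matched to ground-truth boxes with an overlap measure before precision and recall are computed. The notes don't spell that step out, so this is an illustrative intersection-over-union (IoU) sketch with an assumed [xmin, ymin, xmax, ymax] box format.

    def iou(box_a, box_b):
        # Intersection rectangle between the two boxes.
        x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)

        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter)

    # Ground truth (green) vs prediction (red dashed): overlap of 25 over a union of 175.
    print(iou([0, 0, 10, 10], [5, 5, 15, 15]))  # ~0.14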

need to understand where the data came from


Exploratory Data Analysis

looking at data before working with it

object density

environment

weather

light
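
A minimal EDA sketch along these lines, assuming a hypothetical annotations table with one labelled object per row; the file name and column names are made up for illustration.

    import pandas as pd

    annotations = pd.read_csv("annotations.csv")  # hypothetical annotation export

    # Object density: how many labelled objects per image?
    print(annotations.groupby("image_id").size().describe())

    # Distribution of capture conditions.
    print(annotations["weather"].value_counts())
    print(annotations["light"].value_counts())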

ML algorithms are sensitive to domain shift

this can happen at different levels

weather/light conditions

sensor

environment

Cross Validation

ensuring an ML algorithm can perform well in any environment

we should evaluate the generalization ability of the ML model

overfitting

bias-variance-tradeoff

cross validation

when the model does not generalize well

why it is hard to create a balanced model

technique to evaluate how well a model generalizes

fits the training data exactly

good at predicting on and handling new data

chosen model is too complex

Test Error = Variance + Bias² + epsilon (irreducible error)

error rate of the model on the test data set (see the expanded decomposition below)

variance - sensitivity to the training data; increases with model complexity

bias - quality of the fit; decreases as the model becomes more complex

test error - error rate on the test data; plotted against complexity it follows a parabola-like (U-shaped) curve, so there is an optimal complexity

low variance means the model generalizes to new data more reliably
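
Written out in full (a standard statement of the bias-variance decomposition; the epsilon term above is the irreducible noise sigma^2):

    E\big[(y - \hat{f}(x))^2\big] = \mathrm{Var}\big(\hat{f}(x)\big) + \mathrm{Bias}\big(\hat{f}(x)\big)^2 + \sigma^2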

Validation Set

include 80-90% of data in the training set

include 10-20% of data in the validation set

used for cross-validation to select the best parameters or compare models
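
A sketch of this split using scikit-learn (assumed to be available; the dataset here is synthetic): hold out 20% for validation and run k-fold cross-validation on the rest to compare settings.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)  # stand-in dataset

    # 80/20 train/validation split.
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

    # 5-fold cross-validation on the training portion to estimate generalization.
    model = LogisticRegression(max_iter=1000)
    print(cross_val_score(model, X_train, y_train, cv=5).mean())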

TFRecord

TensorFlow record

not human readable

need to use a proto file (protocol buffer definition) to read the fields of a TFRecord

TensorFlow's custom binary data format
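
A minimal TFRecord round trip with TensorFlow to illustrate the point above: the records are opaque bytes until parsed against a feature description (the role the proto/feature spec plays). The feature names here ("image_raw", "label") are made up for illustration.

    import tensorflow as tf

    # Write one serialized tf.train.Example into a TFRecord file.
    example = tf.train.Example(features=tf.train.Features(feature={
        "image_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"\x00\x01"])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[1])),
    }))
    with tf.io.TFRecordWriter("sample.tfrecord") as writer:
        writer.write(example.SerializeToString())

    # Read it back by parsing each record against a feature description.
    feature_description = {
        "image_raw": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    for record in tf.data.TFRecordDataset("sample.tfrecord"):
        parsed = tf.io.parse_single_example(record, feature_description)
        print(parsed["label"].numpy())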
