ML
Can you explain the bias-variance trade-off in machine learning?
Bias
(Underfitting)
error introduced by overly simplistic assumptions that cause the model to miss the underlying patterns in the data
Variance
(Overfitting)
sensitivity to small fluctuations or noise in the training data
as you reduce bias (typically by making the model more complex or flexible), you tend to increase variance, and vice versa
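A standard way to make the trade-off precise is the bias-variance decomposition of expected squared error; the symbols below (f for the true function, \hat f for the learned model, \sigma^2 for irreducible noise) are assumptions added for illustration, not part of the original notes:
% bias-variance decomposition at a point x
\mathbb{E}\big[(y - \hat f(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat f(x) - \mathbb{E}[\hat f(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}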
Solutions
Regularization techniques
(L1 and L2 regularization)
a set of methods to prevent overfitting and improve the generalization performance of models
add constraints or penalties to the model's learning process, encouraging it to be simpler and less prone to overfitting
L1 Regularization
(Lasso Regularization)
a penalty is added to the model's cost function that is proportional to the absolute values of its coefficients
encourages some of the model's coefficients to become exactly zero
useful for feature selection because it tends to set the weights of less important features to zero
can help create sparse models, which are models that use only a subset of the available features.
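A minimal sketch of L1 regularization using scikit-learn's Lasso; the alpha value and the synthetic data are illustrative assumptions, not from the original notes:
# L1 (Lasso) sketch: the absolute-value penalty drives some coefficients to exactly zero.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))           # 10 features, only 3 actually matter
y = 3 * X[:, 0] - 2 * X[:, 1] + X[:, 2] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1)                 # larger alpha -> stronger penalty -> sparser model
lasso.fit(X, y)
print(lasso.coef_)                       # most of the 7 irrelevant coefficients end up at 0.0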
L2 Regularization
(Ridge Regularization)
a penalty is added to the model's cost function that is proportional to the square of its coefficients
discourages large coefficient values and encourages all features to have small, non-zero weights
helps prevent the model from becoming overly sensitive to the specific training data and reduces the risk of overfitting
often used when you have many features, and you want to prevent any single feature from dominating the model
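For comparison, a similar sketch with L2 (Ridge): coefficients shrink toward zero but typically stay non-zero; again, alpha and the data are illustrative assumptions:
# L2 (Ridge) sketch: the squared penalty shrinks all coefficients toward zero
# without forcing any of them exactly to zero.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + X[:, 2] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0)                 # larger alpha -> smaller, more even weights
ridge.fit(X, y)
print(ridge.coef_)                       # small but generally non-zero weights for every feature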
Cross-validation
used to assess the performance and generalization of a predictive model
particularly helpful in estimating how well a model will perform on unseen data
Data Splitting
The first step is to divide the available dataset into two or more subsets: a training set and a testing set.
testing set
used to evaluate the performance of the trained model on data it has not seen
training set
used to train the machine learning model
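A minimal data-splitting sketch with scikit-learn; the 80/20 split, the iris dataset, and random_state are assumptions chosen for illustration:
# Hold out a test set before any training.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)       # e.g. (120, 4) (30, 4)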
K-Fold Cross-Validation
the training data is further divided into K equally sized "folds" or subsets
the model is then trained and evaluated K times, each time using a different fold as the testing set and the remaining folds as the training set
the results of each iteration are averaged to obtain a final performance metric
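A sketch of the K-fold loop itself using scikit-learn's KFold; K=5, the iris data, and logistic regression are assumptions for illustration:
# Manual K-fold loop: each fold serves once as the held-out test set.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

scores = []
for train_idx, test_idx in kf.split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])                  # train on K-1 folds
    scores.append(model.score(X[test_idx], y[test_idx]))   # evaluate on the held-out fold
print(scores)                            # one accuracy value per fold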
Performance Evaluation
during each iteration of cross-validation, the model is trained on one subset (fold) and tested on another
The evaluation metric(s) of interest, such as accuracy, mean squared error, or F1 score, is recorded for each iteration.
Averaging Results
After K iterations, the performance metrics are averaged to obtain a more robust and reliable estimate of the model's performance.
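The per-fold evaluation and averaging steps are what scikit-learn's cross_val_score wraps up in one call; the accuracy metric, cv=5, and the model choice here are assumptions:
# cross_val_score runs the K iterations and records one score per fold;
# the mean is the averaged performance estimate.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores)          # score for each fold
print(scores.mean())   # averaged, more robust estimate of generalization performance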
Feature engineering
creating, transforming, or selecting informative input features so the model can capture the underlying patterns without needing excessive complexity
Underfitting
(too simple and high bias)
a model is too simple to capture the underlying patterns in the data, leading to poor performance on both the training data and new, unseen data
Overfitting
(too complex and has high variance)
a model with high variance is highly flexible and fits the training data very closely, but it tends to generalize poorly to new, unseen data because it memorizes noise in the training data rather than learning the true underlying patterns
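A common way to see both failure modes is to vary model complexity, for example polynomial degree, and compare training vs. test error; the noisy sine data and degrees 1 and 15 below are assumptions chosen to exaggerate the effect:
# Underfitting vs. overfitting: degree 1 is too simple (high bias, both errors high),
# degree 15 chases the noise (high variance, low train error, high test error).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(scale=0.2, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree,
          mean_squared_error(y_train, model.predict(X_train)),   # training error
          mean_squared_error(y_test, model.predict(X_test)))     # test error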