Chapter 13-15 (Chapter 13. startup processes (easiest way to bring a…

- - - - The "Unique" column helps us find duplicates and weed them out of the data
      - When the target feature is selected, it is tagged as the target. Other tags, such as "too many values", can also be encountered.
      - The "Missing" column shows how many values are missing from a specific dataset. Filtering out this data will pay dividends.
      - Algorithms that struggle with missing values include regression, neural networks, and support vector machines. DR finds nulls and question marks ("?"). This is especially relevant when joins are done.
      - Sometimes, if data is missing and the algorithms can't handle missing values, DR will impute values before running.
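The unique and missing counts and the imputation step above can be illustrated with a small sketch. This is not DR's implementation: the helper names, the set of missing markers, and the choice of median imputation are all assumptions made for illustration.

```python
from statistics import median

# Assumption: these markers count as "missing" (nulls, blanks, question marks).
MISSING_MARKERS = {None, "", "?"}

def summarize_column(values):
    """Count distinct present values and missing entries in one column."""
    present = [v for v in values if v not in MISSING_MARKERS]
    return {
        "unique": len(set(present)),
        "missing": len(values) - len(present),
    }

def impute_median(values):
    """Fill missing entries of a numeric column with the median of the present ones.

    Median imputation is an illustrative choice, not necessarily what DR does.
    """
    present = [v for v in values if v not in MISSING_MARKERS]
    fill = median(present)
    return [fill if v in MISSING_MARKERS else v for v in values]

# Hypothetical column containing a null and a "?" marker.
ages = [34, None, 51, "?", 34, 29]
summary = summarize_column(ages)
imputed = impute_median(ages)
print(summary)
print(imputed)
```

A low unique count relative to the number of rows hints at repeated values, and a nonzero missing count flags a column that needs filtering or imputation before it reaches an algorithm that cannot handle gaps.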
- - - - 15.4 Model Selection process
        
        Samples of 16% and 32% of the data are used to train the models, while the full set is used to evaluate them.
        
        In DR, there are symbols for other kinds of models besides R-based ones, such as Vowpal Wabbit, TensorFlow, XGBoost, and blender models.
        
        After the 32% round, the models move on to a 64% sample.
        
        Cross validation is then run if the validation data set is small, meaning fewer than 10k cases.
        
        Then a blender is used to average the probability scores of the models' predictions.
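The selection steps above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not DR's actual code: the function names are hypothetical, the percentages are assumed to apply to the total row count, and the blender is assumed to be a plain unweighted mean.

```python
SAMPLE_PERCENTS = [16, 32, 64]  # training rounds named in the notes
CV_THRESHOLD = 10_000           # "small" validation set: fewer than 10k cases

def rows_per_round(n_rows):
    """Rows used for training in each round (assumes percent of all rows)."""
    return [n_rows * pct // 100 for pct in SAMPLE_PERCENTS]

def use_cross_validation(n_validation_rows):
    """Cross validation applies only when the validation set is small."""
    return n_validation_rows < CV_THRESHOLD

def blend_average(model_probs):
    """Average the predicted probabilities of several models, row by row."""
    n_models = len(model_probs)
    return [sum(row) / n_models for row in zip(*model_probs.values())]

# Hypothetical predictions from two models on three cases.
predictions = {
    "model_a": [0.25, 0.5, 0.75],
    "model_b": [0.75, 0.5, 0.25],
}
blended = blend_average(predictions)
print(rows_per_round(5000))
print(use_cross_validation(5000))
print(blended)
```

Averaging tends to smooth out the errors of individual models, which is why a blender can score better than any single model it combines.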