4 - Modeling (SageMaker Training (Hardware (Performance: ASIC > GPU >…

- - - - The image tag allows versioning of the training image. Use ":latest" for the most recent image; use ":1" for a stable version (for production purposes)
- - - - Specify training algo
      - Specify algo-specific hyperparameters
      - Specify the input and output configuration
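The three "Specify" items above map onto fields of a SageMaker training job request. The following is a minimal sketch that only builds the request dictionary, assuming the field names of the `CreateTrainingJob` API; the function name `build_training_job_request`, the instance type, the limits, and every value in the usage example (image URI, role ARN, S3 paths, hyperparameters) are hypothetical placeholders, not values from this outline.

```python
def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri, hyperparameters):
    """Assemble a CreateTrainingJob-style request as a plain dict (no AWS call is made)."""
    return {
        "TrainingJobName": job_name,
        # Training algorithm: the container image, including its version tag.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Algorithm-specific hyperparameters; the API expects string values.
        "HyperParameters": {name: str(value) for name, value in hyperparameters.items()},
        # Input configuration: where the training channel reads its data.
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        # Output configuration: where model artifacts are written.
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Assumed hardware and time limit, chosen only for illustration.
        "ResourceConfig": {"InstanceType": "ml.m5.xlarge",
                           "InstanceCount": 1,
                           "VolumeSizeInGB": 10},
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# Hypothetical placeholder values; replace with real ones for your account.
request = build_training_job_request(
    job_name="example-training-job",
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<algorithm>:1",
    role_arn="arn:aws:iam::<account>:role/<role-name>",
    train_s3_uri="s3://<bucket>/train/",
    output_s3_uri="s3://<bucket>/output/",
    hyperparameters={"k": 10, "feature_dim": 784},
)
```

With real values filled in, a dictionary of this shape could be passed to `boto3.client("sagemaker").create_training_job(**request)`. Pinning the image tag to ":1" rather than ":latest" keeps production runs on a stable version.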
- - - - Stratified Sampling - applies random sampling to each subgroup separately. It ensures that rare populations are not underrepresented in the training dataset
      - Ensures that the data is evenly mixed
    - - Training Data (70-80%)
      - Testing Data (20-30%)
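Stratified sampling combined with a train/test split can be sketched in plain Python as below. This is an illustrative sketch, not a reference implementation: the helper name `stratified_split`, the 80/20 ratio, and the toy "common"/"rare" dataset are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(records, label_of, test_fraction=0.2, seed=0):
    """Split records into (train, test), sampling each label group separately."""
    rng = random.Random(seed)

    # Group the records by label so every subgroup is sampled on its own.
    groups = defaultdict(list)
    for record in records:
        groups[label_of(record)].append(record)

    train, test = [], []
    for label in sorted(groups):
        members = groups[label]
        rng.shuffle(members)
        # Take the same fraction from every group, with at least one test example.
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])

    # Shuffle again so the groups are mixed rather than stored back to back.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Hypothetical imbalanced dataset: 90 "common" rows and 10 "rare" rows.
data = [("common", i) for i in range(90)] + [("rare", i) for i in range(10)]
train, test = stratified_split(data, label_of=lambda row: row[0], test_fraction=0.2)
```

Because each group is sampled separately, the rare class keeps its 10% share in both the training and the testing set, which a purely random split does not guarantee.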
    - - K-Fold cross-validation, where k is the number of folds (equal partitions) the data is split into
      - Each round, you split the data using a different cross-section of the total dataset, so that your model sees a variety of input
      - Error Rates - if the error rates of rounds 1-4 are roughly the same, we can be fairly confident that the data was well randomized. If, on the other hand, one round has a significantly higher error rate, the dataset was not well randomized to begin with.
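The K-Fold procedure and the error-rate comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the 0.05 tolerance used for "roughly the same", the toy labels, and the majority-class "model" are all hypothetical and stand in for a real training step.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (training_indices, validation_indices) for each of the k rounds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        training = [idx for j, fold in enumerate(folds) if j != held_out
                    for idx in fold]
        yield training, validation


def error_rates_look_consistent(error_rates, tolerance=0.05):
    """Return True if the spread between the best and worst round is small."""
    return max(error_rates) - min(error_rates) <= tolerance


# Hypothetical labels; the "model" simply predicts the majority training label.
labels = [0] * 70 + [1] * 30
rates = []
for training, validation in k_fold_indices(len(labels), k=4):
    train_labels = [labels[i] for i in training]
    prediction = max(set(train_labels), key=train_labels.count)
    errors = sum(1 for i in validation if labels[i] != prediction)
    rates.append(errors / len(validation))

print(rates, error_rates_look_consistent(rates))
```

Every sample is used for validation in exactly one round. If one round's error rate stands well apart from the others, that points to a poorly randomized dataset, as described above.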