Noisy Image Classification
Medical image datasets
Increasing in size and availability
Larger training sets have yielded more accurate deep learning models
Labeling can now be automated
Use of Natural Language Processing (NLP)
Prone to noise
Use of Chest X-ray 14 dataset
Known for noisy samples
Contribution: estimating performance on a clean test set from a noisy test set
Deep Neural Networks (DNN)
Depend on the dataset fed into the model
Challenges
Difficult/expensive to extract
Time-consuming to label
Even specialized radiologists find it difficult
Challenges
Noisy samples lead to overfitting
Noisy samples lead to low accuracy
Methodology
Experiment and Results
Multi-label image classification (one image can carry several classes)
Use of early learning regularization (ELR)
Noisy label learning
Robust loss function
Transition matrices
Sample selection
Cannot be easily extended to multi-label problems
Labels can be extracted through NLP
Can be unreliable
Prone to errors
Estimating accuracy of clean test set
Lower bound estimation
Function depending on several variables
Algorithm to regularize cross-entropy loss
For boosting clean gradients
For dampening noisy gradients
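Sketch of the ELR idea above, for illustration only. It follows the commonly cited multi-class form of ELR (cross-entropy plus a log(1 - <target, prediction>) penalty, with targets kept as a running average of past predictions). The paper's multi-label adaptation is not reproduced here. The function name `elr_loss` and the default values of `lam` and `beta` are assumptions chosen for this example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def elr_loss(logits, labels, targets, lam=3.0, beta=0.7):
    """Return (loss, updated_targets) for one batch.

    logits  : (n, c) raw model outputs
    labels  : (n,) integer labels, possibly noisy
    targets : (n, c) running average of earlier predictions
    lam, beta : illustrative hyperparameters (assumed values)
    """
    n = logits.shape[0]
    probs = np.clip(softmax(logits), 1e-4, 1.0 - 1e-4)

    # Temporal ensembling: blend the old targets with the current prediction.
    normalized = probs / probs.sum(axis=1, keepdims=True)
    new_targets = beta * targets + (1.0 - beta) * normalized

    # Standard cross-entropy against the (possibly noisy) labels.
    ce = -np.log(probs[np.arange(n), labels]).mean()

    # Regularizer: rewards agreement between the prediction and the
    # early-learned target, which counteracts gradients from labels
    # that contradict what the model learned early in training.
    inner = (new_targets * probs).sum(axis=1)
    reg = np.log(1.0 - inner).mean()

    return ce + lam * reg, new_targets

# Toy batch: 4 samples, 3 classes, targets start at zero.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 0.3],
                   [-1.0, 3.0, 0.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2, 1, 1])
targets = np.zeros((4, 3))

loss, new_targets = elr_loss(logits, labels, targets)
ce_only, _ = elr_loss(logits, labels, targets, lam=0.0)
```

In a real training loop the targets would be stored per training sample and excluded from gradient computation; only the prediction term is differentiated.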
Accuracy of noisy test set
Value of delta, ranging from 0 to 1
Size of test set
Noise rate
Noise transition matrix
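Sketch of how the variables listed above can be computed or used. This is not the paper's lower bound, whose exact form is not given in these notes. It shows a generic one-sided Hoeffding bound (how delta and the test set size typically enter such an estimate) and an empirical noise rate and noise transition matrix from paired labels. All function names and the toy labels are assumptions for illustration, and the single-label setup is a simplification of the multi-label case.

```python
import math
import numpy as np

def hoeffding_lower_bound(noisy_accuracy, n, delta):
    # With probability at least 1 - delta, the expected accuracy on the
    # noisy label distribution is at least this value (generic Hoeffding
    # bound, assumed here only as an illustration).
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return noisy_accuracy - math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def noise_rate(clean, noisy):
    # Fraction of samples whose noisy label differs from the clean one.
    return float(np.mean(clean != noisy))

def transition_matrix(clean, noisy, num_classes):
    # T[i, j] estimates P(noisy label = j | clean label = i).
    counts = np.zeros((num_classes, num_classes))
    np.add.at(counts, (clean, noisy), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(row_sums, 1)  # avoid division by zero

# Hypothetical paired labels for 8 samples and 2 classes.
clean = np.array([0, 0, 0, 0, 1, 1, 1, 1])
noisy = np.array([0, 0, 0, 1, 1, 1, 0, 0])

rate = noise_rate(clean, noisy)
T = transition_matrix(clean, noisy, num_classes=2)

# Hypothetical measurement: 80% accuracy on a noisy test set of 1000 images.
bound = hoeffding_lower_bound(noisy_accuracy=0.80, n=1000, delta=0.05)
```

A smaller delta or a smaller test set widens the gap between the measured accuracy and the bound.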
Dataset consists of 15 classes
Initial 14 classes
Addition of "No Label" class
ELR outperformed several state-of-the-art models
Lower bounds for clean test set accuracy
Lower than noisy test set accuracy
Note: Would it be better to simply remove the noise (?)
Overall: The paper proposes a new approach in which removing the noise is no longer necessary. However, the authors should have compared their results against a test set from which the noise had been removed.
Critique: High accuracy is necessary, especially for highly sensitive data such as medical images. To be deployed in a real setting, their approach, although novel, would need to achieve higher accuracy.
Liu, F., Tian, Y., Cordeiro, F. R., Belagiannis, V., Reid, I., and Carneiro, G. Noisy label learning for large-scale medical image classification, 2021.