Please enable JavaScript.
Coggle requires JavaScript to display documents.
Text Detection and Character Recognition in Scene Images with Unsupervised…
Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning
Problem
Detection of text and identification of characters in scene images is a challenging visual recogniiton problem.
To solve this problem...
To solve this problem authors apply methods recently developed in machine learning - specifically, large-scale algorithms for learning the features and show that they allow them to construct highly effective classifiers for both detection and recognition to be used in a high accuracy end-to-end system.
Related Work
Survey some related work in scene text recognition, as well as the machine learning and vision results that inform their basic approach.
Desription of the learning architecture used in their
experiments.
Feature Learning
Feature extraction
Text detector training
Character classifier training
Authors' system proceeds in several stages
Apply an unsupervised feature learning algorithm to a set of image patches harvested from the training data to learn a bank of image features.
Evaluate the features convolutionallty over the training images. Reduce the number of features using spatial pooling.
Train a linear classifier for either text detection or character recognition.
Experimental Results
Authors present experimental results achuieved wuth the their system, demonstrating the impact of being able to train increasing numbers of features.
Specifically, for detection and character recognition, they trained their classifiers with increasing numbers of learned features and in each case evaluated the results on the test sets for text detection and character recognition.
81.7% ; 81.4% ; 85,5%.
Conclusion
Authors reach high accuracy, using developed system based on scalable feature learning algorithm.
Also they say that with more scalable and sophisticated feature learning algorithms currently being developed by machine learning researchers, it is possible thet the approaches pursued here might achieve performance well beyond what is possible through other methods that rely heavity on hand-coded prior knowledge.