Robust Scene Text Recognition with Automatic Rectification
Problem
Recognizing text in natural images is a challenging task with many unsolved problems.
Unlike words in documents, words in natural images often have irregular shapes caused by perspective distortion, curved character placement, and similar effects.
To address this problem, the authors propose RARE (Robust text recognizer with Automatic REctification).
About the system
The system consists of a Spatial Transformer Network and a Sequence Recognition Network.
The Spatial Transformer Network transforms an input image into a rectified image, and the Sequence Recognition Network recognizes the text in the rectified image.
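The two-stage flow described above (rectify first, then recognize) can be illustrated with a minimal sketch. It is not the authors' implementation: the coordinate mapping here is written by hand, whereas in RARE the transformation is predicted by the network, and the recognizer is passed in as a plain callable. The names `bilinear_sample`, `rectify`, and `read_text` are hypothetical and chosen only for this illustration.

```python
import math
from typing import Callable, List, Tuple

Image = List[List[float]]                              # grayscale image, row-major
Mapping = Callable[[int, int], Tuple[float, float]]    # output (col, row) -> source (x, y)


def bilinear_sample(img: Image, x: float, y: float) -> float:
    """Sample img at a real-valued (x, y); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])

    def px(xx: int, yy: int) -> float:
        return img[yy][xx] if 0 <= xx < w and 0 <= yy < h else 0.0

    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = px(x0, y0) * (1 - fx) + px(x0 + 1, y0) * fx
    bottom = px(x0, y0 + 1) * (1 - fx) + px(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy


def rectify(img: Image, mapping: Mapping, out_h: int, out_w: int) -> Image:
    """Build the rectified image by sampling the source at mapped coordinates."""
    return [
        [bilinear_sample(img, *mapping(col, row)) for col in range(out_w)]
        for row in range(out_h)
    ]


def read_text(img: Image, mapping: Mapping, out_h: int, out_w: int,
              recognizer: Callable[[Image], str]) -> str:
    """Stage 1: rectify the image. Stage 2: hand it to the recognizer."""
    return recognizer(rectify(img, mapping, out_h, out_w))


# A vertical stroke that has been sheared one pixel to the right per row.
sheared = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Assumed, hand-written mapping that undoes the shear.
undo_shear = lambda col, row: (col + row, row)

rectified = rectify(sheared, undo_shear, out_h=3, out_w=4)
```

After rectification the stroke occupies column 1 in every row, which is the kind of "straightened" input the recognition stage is meant to receive. Because sampling is bilinear, the same pattern extends to fractional source coordinates, which a learned transformation would generally produce.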
Related Work
Proposed Model
This part describes the structure, principle, and operation of the Spatial Transformer Network and the Sequence Recognition Network in full. It also explains how the model is trained.
Experiments
First, they evaluate the model on general recognition benchmarks, which consist mainly of regular text but also contain some irregular text.
Next, they evaluate it on benchmarks designed specifically for irregular text recognition.
Conclusion
The extensive results show that
1) without geometric supervision, the learned model can automatically generate more "readable" images for both humans and the sequence recognition network
2) the proposed text rectification method can significantly improve recognition accuracies on irregular scene text
3) the proposed scene text recognition system is competitive with the state of the art.
They plan to address the end-to-end scene text reading problem by combining RARE with a scene text detection method.