OCR

Introduction

Optical character recognition (OCR) is the recognition of printed or written text characters by a computer.

This involves scanning the text character by character, analyzing the scanned image, and then translating each character image into character codes, such as ASCII, that are commonly used in data processing.
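To make the last step concrete, the short sketch below shows what "character codes" look like for a piece of text. The string `recognized` is a stand-in for the output of a recognizer, not something produced by real OCR here.

```python
# Stand-in for text that a recognizer has already produced (an assumption).
recognized = "OCR"

# Each character is stored as a numeric ASCII code.
codes = list(recognized.encode("ascii"))
print(codes)  # [79, 67, 82]

# The codes map back to the same text.
decoded = bytes(codes).decode("ascii")
print(decoded)  # OCR
```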

How OCR Works

OCR is a method of converting a scanned image into text.

When a page is scanned, it is typically stored as a bit-mapped file in TIFF format.

When the image is displayed on the screen, we can read it. But to the computer, it is just a series of black and white dots.
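A rough illustration of this idea: the sketch below stores one glyph as a grid of dots and compares it with a few stored templates, picking the closest one. This is a toy version of template matching under simplified assumptions (fixed 5x5 glyphs, three hand-made templates); the names `SCANNED`, `TEMPLATES`, `similarity`, and `recognize` are made up for the example, and real OCR software is far more elaborate.

```python
# 1 = black dot, 0 = white dot. This "scan" is a T with one stray speck.
SCANNED = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
]

# Hand-made reference glyphs (hypothetical, for illustration only).
TEMPLATES = {
    "T": [
        [1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
    ],
    "L": [
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
    ],
    "I": [
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
    ],
}


def render(bitmap):
    """Draw the dots as text so a person can read the glyph."""
    return "\n".join("".join("#" if dot else "." for dot in row) for row in bitmap)


def similarity(a, b):
    """Count the positions where two bitmaps have the same dot value."""
    return sum(x == y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def recognize(bitmap, templates):
    """Return the character whose template agrees with the bitmap most often."""
    return max(templates, key=lambda ch: similarity(bitmap, templates[ch]))


print(render(SCANNED))
print(recognize(SCANNED, TEMPLATES))  # T
```

The program never "sees" a letter; it only counts matching dots, which is why the stray speck lowers the score without changing the result.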

Advantages

Faster Searches

Reduced Cost

Reduced Errors

More Storage Space

Ready Availability

Efficient Management

Security

Disadvantages

Limited Documents

Not Accurate

Poor-quality documents can create enough errors to require lengthy and time-consuming proofreading.

Handwriting and non-Latin fonts are particularly difficult to scan correctly.

Documents that lack significant contrast between the characters and the background are hard to recognize.

Dirty pages, or those printed on colored stock, may confuse a scanner.

Functions of OCR

Forms containing character images can be scanned with a scanner.

OCR software works with your scanner to convert printed characters into digital text.

It allows you to search for or edit your document in a word processing program.
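Once pages exist as digital text, searching them is ordinary string handling. The sketch below assumes the recognition step has already run; the file names and page texts in `recognized_pages` are invented for the example.

```python
# Hypothetical recognized text, keyed by the scanned file it came from.
recognized_pages = {
    "invoice_001.tif": "Invoice 1001\nTotal due: 250.00",
    "letter_002.tif": "Dear customer, your invoice is attached.",
    "memo_003.tif": "Meeting moved to Friday.",
}


def search(pages, term):
    """Return the names of pages whose text contains the term, ignoring case."""
    term = term.casefold()
    return sorted(name for name, text in pages.items() if term in text.casefold())


print(search(recognized_pages, "invoice"))
# ['invoice_001.tif', 'letter_002.tif']
```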

Features of OCR

The technology provides a complete form-processing and document-capture solution.

OCR uses a modular architecture that is open, scalable, and workflow-controlled.

It includes forms definition, scanning, image pre-processing, and recognition capabilities.
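One way to picture forms definition is as a list of named rectangular zones on the page, each of which is cut out and passed on to recognition separately. The sketch below assumes that interpretation; `FormField`, `crop`, and the zone names are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormField:
    """A named rectangular zone on a scanned form (hypothetical structure)."""
    name: str
    top: int
    left: int
    height: int
    width: int


def crop(page, field):
    """Cut the field's rectangle out of a page stored as rows of dots."""
    rows = page[field.top:field.top + field.height]
    return [row[field.left:field.left + field.width] for row in rows]


# A tiny bit-mapped page: 1 = black dot, 0 = white dot.
PAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
]

# The form definition: where each field sits on the page.
FORM = [
    FormField("box_a", top=1, left=1, height=2, width=2),
    FormField("box_b", top=3, left=4, height=1, width=2),
]

zones = {field.name: crop(PAGE, field) for field in FORM}
print(zones["box_a"])  # [[1, 1], [1, 1]]
print(zones["box_b"])  # [[1, 1]]
```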

System Requirements

Hardware Requirements

Scanner Requirements

Software Requirements

Additional workload for data collectors

Techniques

Pre-processing

Character recognition

Post-processing

Application-specific optimisations
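As a small illustration of two of these techniques, the sketch below shows one possible pre-processing step (binarization with a fixed threshold) and one possible post-processing step (correcting a word against a word list). The threshold of 128, the tiny `lexicon`, and the function names are assumptions chosen for the example; real systems typically use more adaptive methods.

```python
import difflib


def binarize(gray, threshold=128):
    """Pre-processing: turn 0-255 grayscale values into 1 (ink) or 0 (background)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def correct(word, lexicon, cutoff=0.6):
    """Post-processing: swap a word for its closest lexicon entry, if one is close enough."""
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Dark ink on a light page survives the threshold.
clear_patch = [
    [250, 30, 245],
    [240, 25, 250],
]
print(binarize(clear_patch))  # [[0, 1, 0], [0, 1, 0]]

# Faint ink with little contrast falls above the threshold and is lost.
faint_patch = [[200, 170, 205]]
print(binarize(faint_patch))  # [[0, 0, 0]]

# A misread word is pulled back to a known word; an unknown one is left alone.
lexicon = ["scanner", "character", "document"]
print(correct("scannor", lexicon))  # scanner
print(correct("xyz", lexicon))      # xyz
```

Restricting the lexicon to the vocabulary expected in a given application is one simple form of application-specific optimisation.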