Did you know that Optical Character Recognition (OCR), the process of using machines to translate written words into other forms of information, is more than 100 years old? The earliest devices automatically turned text into Morse code that could be telegraphed around the world.
The Optophone (c. 1914) transformed text into tones so that blind people could "hear" written material.
Photo Credit: Artist Unknown, scanned from Vetenskapen och livet, 31 December 1921
Thanks to machine learning, as well as advances in artificial intelligence and raw computing power, OCR has come a long way, but its core mission remains the same: unlock the information trapped on paper, with high accuracy and little or no human intervention.
Hand someone a business card and they know instantly what it is, what it is for and what kind of information is on it, with barely a glance. The human mind recognizes the shape and fills in the blanks.
Advanced scanning software does much the same thing. Based on the size and shape of each scan, the software can deduce whether an object is a business card, driver's license or intake form. Gathering a "big-picture" understanding of the document helps the program predict what type of information it should look for and where it should look.
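To make the idea concrete, here is a minimal Python sketch of guessing a document type from the size and shape of a scan. The function name, the categories and every threshold are assumptions chosen for illustration; real products almost certainly use richer signals than these.

```python
from dataclasses import dataclass


@dataclass
class ScanSize:
    """Physical size of a scanned object, in millimeters."""
    width_mm: float
    height_mm: float


def guess_document_type(scan: ScanSize) -> str:
    """Return a rough guess based only on size and aspect ratio.

    All thresholds are illustrative assumptions, not values taken
    from any particular scanning product.
    """
    long_side = max(scan.width_mm, scan.height_mm)
    short_side = min(scan.width_mm, scan.height_mm)
    aspect = long_side / short_side

    # Business cards and driver's licenses are small and roughly 3:2.
    if long_side <= 100 and 1.4 <= aspect <= 1.9:
        return "card"
    # Receipts are narrow strips, much longer than they are wide.
    if short_side <= 90 and aspect >= 2.5:
        return "receipt"
    # Intake forms are usually printed on full letter or A4 pages.
    if long_side >= 250:
        return "full_page"
    return "unknown"


print(guess_document_type(ScanSize(89, 51)))    # a typical business card
print(guess_document_type(ScanSize(216, 279)))  # a US letter page
```

A guess like "card" only narrows the field. Telling a business card from a driver's license still depends on what is printed on it.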
Layouts — the relative sizes and positions of objects on a page — provide even more context for the scanning software to identify information. For instance, centered type at the top of a long strip is most likely a business name and address. A number column in the right margin probably contains prices. The scanning software combines these patterns into a running hypothesis of what it's seeing.
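The layout cues described above can be sketched as a couple of simple rules. In this hypothetical Python example, each block of text carries a position expressed as a fraction of the page, and the labels, field names and cutoffs are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class TextBlock:
    text: str
    x_center: float  # 0.0 = left edge of the page, 1.0 = right edge
    y_top: float     # 0.0 = top of the page, 1.0 = bottom


# Matches amounts such as "12.50", "4,00" or "$4.00".
PRICE = re.compile(r"^\$?\d+[.,]\d{2}$")


def label_block(block: TextBlock) -> str:
    """Guess what a block of text is from where it sits on the page."""
    # Centered type near the top: likely a business name or address.
    if block.y_top < 0.15 and abs(block.x_center - 0.5) < 0.1:
        return "business_name_or_address"
    # Number-like text near the right margin: likely a price.
    if block.x_center > 0.75 and PRICE.match(block.text.strip()):
        return "price"
    return "other"


blocks = [
    TextBlock("Joe's Diner", 0.5, 0.02),
    TextBlock("Burger", 0.2, 0.40),
    TextBlock("12.50", 0.9, 0.40),
]
for block in blocks:
    print(block.text, "->", label_block(block))
```

Each label is a hypothesis rather than a verdict, which is why it makes sense to combine it with the document-type guess and the characters themselves.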
Data extraction is the best-developed step in the process (it's been around for 100 years, after all). In this step, scanning software recognizes individual numbers and letters with exceptional accuracy. The next step, however, is determining what those characters actually mean, and that is what separates the "OK" software from truly timesaving solutions.
Based on these three factors (pattern recognition, layout and the raw data), the best software is able to decipher the information and convert what it extracts into data a computer (or a human) can search, review and use. Once converted, this data can flow into a spreadsheet, populate a sales database or be fed into your accounting system. The possibilities for such data, once it is accurately interpreted, are virtually endless.
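As one small illustration of data flowing into a spreadsheet, the Python sketch below writes interpreted fields out as CSV, a format most spreadsheet programs can open. The field names ("vendor", "date", "total") and the sample values are hypothetical.

```python
import csv
import io


def fields_to_csv(records: list[dict[str, str]], columns: list[str]) -> str:
    """Serialize extracted records as CSV text, one row per document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # skip fields the spreadsheet has no column for
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Fields missing from a record become empty cells.
        writer.writerow({column: record.get(column, "") for column in columns})
    return buffer.getvalue()


receipts = [
    {"vendor": "Joe's Diner", "date": "2024-03-01", "total": "23.40"},
    {"vendor": "City Parking", "total": "8.00"},
]
print(fields_to_csv(receipts, ["vendor", "date", "total"]))
```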
The best software can capture, extract and interpret bulk scans of mixed documents with a high degree of accuracy. However, the software's work should still be reviewed and verified by a human being to ensure complete accuracy. A "proofing" view places the scanned image side by side with the interpreted data to make the job easier.
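One way a review step might be organized is to surface the least certain fields first. The Python sketch below assumes the software attaches a confidence score between 0 and 1 to each field. That score, the 0.9 cutoff and the names used here are assumptions, not features of any specific product.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed score: 0.0 (a guess) to 1.0 (certain)


def needs_review(fields: list[ExtractedField],
                 threshold: float = 0.9) -> list[ExtractedField]:
    """Return fields below the threshold, least confident first."""
    flagged = [field for field in fields if field.confidence < threshold]
    return sorted(flagged, key=lambda field: field.confidence)


fields = [
    ExtractedField("vendor", "Joe's Diner", 0.98),
    ExtractedField("total", "23.40", 0.71),
    ExtractedField("tip", "4.00", 0.85),
]
for field in needs_review(fields):
    print(f"Check {field.name}: {field.value} ({field.confidence:.0%})")
```

Ordering the queue this way does not replace the human check. It only points the reviewer at the fields most likely to be wrong.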
Everything extracted from a scan, from the tips on restaurant receipts to multi-page articles and whitepapers, becomes metadata for your document management system. This allows employees to find the documents they need quickly and easily.
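To show why extracted text makes documents easy to find, here is a minimal Python sketch of a keyword index. It is a toy stand-in for a real document management system, and the document IDs and contents are invented for the example.

```python
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the IDs of the documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            # Trim punctuation from the ends of each word.
            index[word.strip(".,;:!?")].add(doc_id)
    return index


def search(index: dict[str, set[str]], term: str) -> list[str]:
    """Return the sorted IDs of documents that contain the term."""
    return sorted(index.get(term.lower(), set()))


documents = {
    "receipt-001": "Joe's Diner total 23.40 tip 4.00",
    "paper-07": "A whitepaper on scanning receipts.",
}
index = build_index(documents)
print(search(index, "tip"))
print(search(index, "receipts"))
```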