Overview
Optical character recognition (OCR), also called optical character reader, is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text, whether from a scanned document, a PDF, or a photo of a document.
Traditional OCR technologies often fail to recognize common layouts such as forms and tables, and usually produce a lengthy, unstructured text dump. What organizations want instead is the ability to accurately identify and extract text and data from forms and tables, across a variety of document formats, file types, and templates.
Amazon Textract uses machine learning to go beyond traditional OCR software: it identifies not only each character and word but also the contents of form fields and the information stored in tables, in scanned images, documents, and PDFs.
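To make the difference between a text dump and structured output concrete, here is a minimal Python sketch of how form fields might be read out of a Textract response. It assumes the documented AnalyzeDocument response shape (a list of Blocks in which KEY_VALUE_SET blocks point to their VALUE block and to CHILD WORD blocks). The helper names extract_form_fields, get_text, and analyze_file, and the hand-written SAMPLE_RESPONSE, are illustrative assumptions and not part of the service. A real response carries more fields, such as geometry and confidence scores.

```python
from typing import Any, Dict

Block = Dict[str, Any]


def get_text(block: Block, blocks_by_id: Dict[str, Block]) -> str:
    """Join the text of the WORD blocks listed as CHILD of `block`."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_form_fields(response: Dict[str, Any]) -> Dict[str, str]:
    """Map each form key to its value text (hypothetical helper)."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    fields = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        value_parts = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                # Follow the link from the key block to its value block(s).
                for value_id in rel["Ids"]:
                    value_parts.append(get_text(blocks_by_id[value_id], blocks_by_id))
        fields[get_text(block, blocks_by_id)] = " ".join(value_parts)
    return fields


def analyze_file(path: str) -> Dict[str, Any]:
    """Call Textract on a local file; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the parsing helpers work offline

    client = boto3.client("textract")
    with open(path, "rb") as f:
        return client.analyze_document(
            Document={"Bytes": f.read()},
            FeatureTypes=["FORMS", "TABLES"],
        )


# Hand-written sample in the shape of a Textract response (assumption:
# trimmed to the fields the helpers above actually read).
SAMPLE_RESPONSE = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                           {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w3", "w4"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Business"},
        {"Id": "w2", "BlockType": "WORD", "Text": "name:"},
        {"Id": "w3", "BlockType": "WORD", "Text": "Acme"},
        {"Id": "w4", "BlockType": "WORD", "Text": "Bakery"},
    ]
}

if __name__ == "__main__":
    print(extract_form_fields(SAMPLE_RESPONSE))
```

With real documents, the dictionary returned by analyze_file could be passed to extract_form_fields in place of the sample. Table cells could be walked in a similar way through TABLE and CELL blocks, which this sketch leaves out.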
Use Cases
Kabbage
"Amazon Textract helped us support 80% of PPP applicants to receive a fully automated lending experience and reduced approval times from multiple days to a median speed of 4 hours. By the end of the program, we became the second largest PPP lender in the nation by application volume, surpassing major US banks, serving over 297,000 small businesses, and preserving an estimated 945,000 jobs across America."
Anthony Sabelli, Head of Data Science for Kabbage

Change Healthcare
"At Change Healthcare, we believe that we can make healthcare affordable and accessible to all by improving the timeliness and quality of financial and administrative decisions. This can be achieved by the power of machine learning technology to understand more from our data. But unlocking the potential of this information can often be difficult as it's siloed in tables and forms that traditional optical character recognition hasn't been able to analyze. Amazon Textract further advances document understanding with the ability to retrieve structured data in addition to text, and now with the service becoming HIPAA compliant, we'll be able to liberate the information from millions of documents and create even more value for patients, payers, and providers."
Nick Giannasi, EVP and Chief AI Officer - Change Healthcare

Filevine
"Millions of matters and case files are handled in Filevine every day. We chose Amazon Web Services because we wanted to deliver best-in-class document search solutions for our customers. Amazon Textract is fast, accurate, and scalable - it helps Filevine meet the exacting requirements of the world’s largest and most sophisticated legal organizations. With Filevine and Amazon, finding the proverbial needle in the haystack has never been easier for legal professionals."
Ryan Anderson, Chief Executive Officer - Filevine

Benefits