How OCR Can Help Employees Fight Through the Most Mundane Tasks


These days, office employees need an AI hero. Can you imagine the number of hours wasted on handling a paper-based workflow? Isn’t it time to save employees from piles of paper?

No one is saying it will be easy to eliminate paper documents quickly. For instance, in the legal sphere, where the cost of a mistake is high, verification and validation of documents require human involvement. However, many other business areas are in dire need of intelligent automation of document flow to digitize tons of printed invoices, tickets, and receipts right now.

Businesses around the US and Europe waste too much time on managing traditional paper document workflow – around 46% – instead of focusing on core processes. And 65% of businesses sought to leverage the power of data and digitize paper documents in the year 2018. Is it enough to completely get rid of tons of paper in the coming years? No, it seems not.

A helping hand with this challenge comes from artificial intelligence (AI), which is developing by leaps and bounds year after year, and namely from machine learning (ML) and optical character recognition (OCR) technology. OCR can become your employees’ best assistant in combating routine manual tasks.

Take a picture, improve it, recognize characters

Let’s consider, for instance, a tax agency and all the receipts it deals with on a daily basis. Dozens of employees scan printed documents and gather information about types of services, discounts, dates, providers, and so on. Their work can and should be optimized to enhance productivity, save time and resources, and reassign employees to issues that require human involvement.

And here is an AI-driven solution. The first step to digitizing paper documents is using OCR. Among the powerful out-of-the-box tools meant for text extraction is Google Cloud Vision API, which:

  • demonstrates a high-quality recognition process
  • detects the position of the text on a page
  • provides the level of confidence for character predictions
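
The last two points can be pictured as data. The snippet below is a minimal sketch in Python, not the response format of any particular OCR service: the `OcrWord` class, its field names, the sample values, and the 0.80 threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass


# Hypothetical container for one recognized word. Real OCR tools return
# similar information, but each in its own format.
@dataclass
class OcrWord:
    text: str
    left: int          # x of the top-left corner, in pixels
    top: int           # y of the top-left corner, in pixels
    width: int
    height: int
    confidence: float  # 0.0 (no trust) to 1.0 (full trust)


def low_confidence_words(words, threshold=0.80):
    """Return the words whose confidence is below the threshold."""
    return [w for w in words if w.confidence < threshold]


def reading_order(words):
    """Sort words top-to-bottom, then left-to-right, using their positions."""
    return sorted(words, key=lambda w: (w.top, w.left))


# Made-up sample values, for illustration only.
sample = [
    OcrWord("12.50", 300, 40, 60, 18, 0.62),
    OcrWord("Total", 20, 40, 70, 18, 0.97),
    OcrWord("Receipt", 20, 10, 90, 20, 0.99),
]

ordered = [w.text for w in reading_order(sample)]
doubtful = [w.text for w in low_confidence_words(sample)]
```

Here the position restores the reading order of the words, and the confidence value singles out the amount as the one item worth a second look.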

Google Cloud Vision API may seem excellent until it is compared with the open source Tesseract and with ABBYY FineReader Engine, which provides a multifunctional OCR SDK for developers. These two return the same kind of results, though in a different format, and can outperform the first one.

However, a cloud service such as Google Cloud Vision API depends on an Internet connection and doesn’t support offline mode. What is more, all three solutions work great only with perfect raw images, while in reality paper documents are too often crumpled and full of poorly printed characters. Poor illumination or a low-quality camera can also make characters difficult to recognize.

So, raw images should go through a preprocessing stage. Here, OpenCV – an open source library of computer vision algorithms – can be effectively used as a framework for image processing. The level of confidence reported by the OCR tool then serves as an indicator when checking the quality of the improved images. Another way is to take a human-in-the-loop approach to machine learning or, to put it plainly, engage a human to assist a machine in quality control.
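
To give an idea of what preprocessing can mean in practice, here is a minimal sketch of one common step, binarization with Otsu's method, written in plain Python so that it runs without extra libraries. It is only an illustration and not a full pipeline; the tiny sample image is made up. OpenCV offers the same technique through `cv2.threshold` with the `THRESH_OTSU` flag.

```python
def otsu_threshold(pixels):
    """Pick the gray level that best separates dark and light pixels."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold = 0
    best_variance = -1.0
    weight_dark = 0
    sum_dark = 0
    for level in range(256):
        weight_dark += histogram[level]
        sum_dark += level * histogram[level]
        if weight_dark == 0:
            continue  # no dark pixels yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance (up to a constant factor).
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = level
    return best_threshold


def binarize(image):
    """Turn a grayscale image (list of rows, values 0-255) into pure black and white."""
    flat = [value for row in image for value in row]
    threshold = otsu_threshold(flat)
    return [[255 if value > threshold else 0 for value in row] for row in image]


# Made-up 3x4 grayscale patch: dark ink (about 60) on light paper (about 200).
patch = [
    [200, 210, 205, 198],
    [60, 190, 55, 202],
    [58, 62, 195, 207],
]
clean = binarize(patch)
```

After this step the uneven gray background is gone and only ink and paper remain, which is the kind of input OCR tools handle best.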

An ML algorithm in cooperation with a human

Let’s introduce another use case in which OCR can become employees’ best friend. When an employee is on a business trip, the company is responsible for covering all the costs of hotels, tickets, transfers, meals, communication, etc. And again, such trips result in lots of paper documents, which triggers tedious manual processing. There is a way around this problem: an employee can send back photos or scanned images of the documents for OCR, thus helping save time and resources.

What is interesting here is that all the information gathered via OCR has to be split into several categories. Only after it has been categorized and analyzed can it be used efficiently to eliminate manual processing. But one can’t go without human involvement and human feedback. Training a machine on examples labeled by humans is known as supervised learning. To train an ML algorithm to analyze and split data with higher accuracy, the data should be labeled by human annotators. They should also set strict rules that automatically tell a machine which category an item belongs to.
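
As an illustration of such strict rules, here is a minimal Python sketch. The category names, the keywords, and the `needs_review` fallback are assumptions invented for this example; real rules would come from the annotators who know the documents.

```python
import re

# Hypothetical rules an annotator might write. Order matters: the first match wins.
CATEGORY_RULES = [
    ("hotel", re.compile(r"\b(hotel|room|night|check-?in)\b", re.IGNORECASE)),
    ("transport", re.compile(r"\b(ticket|train|flight|taxi|transfer)\b", re.IGNORECASE)),
    ("meals", re.compile(r"\b(restaurant|lunch|dinner|breakfast|cafe)\b", re.IGNORECASE)),
]


def categorize(text):
    """Return the first category whose rule matches, or hand the text to a human."""
    for category, pattern in CATEGORY_RULES:
        if pattern.search(text):
            return category
    return "needs_review"


# Made-up lines of recognized text, for illustration only.
examples = [
    "Grand Hotel, 2 nights, room 14",
    "Train ticket Berlin - Munich",
    "Dinner for two",
    "Invoice 0042",
]
labels = [categorize(line) for line in examples]
```

Anything the rules cannot place goes to a person instead of being guessed, which keeps the human in the loop.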

An ML algorithm works on features (in this case, properties of a picture such as its dimensions). For feature engineering, it is crucial to use as much domain knowledge as possible, such as the height and width, color, word count, fonts, spacing, time features, etc. Manually labeled pictures make it possible to build a learning model for an algorithm. To start model development, 1-2 thousand labeled items will do. But to get great results, at least 10-20 thousand are a must.
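
To make the idea of features and labeled items more tangible, here is a toy Python sketch. The three features, the document fields, and the two labeled samples are assumptions made up for illustration, and the nearest-neighbor lookup is only a stand-in for a real model, which would need thousands of labeled items and properly scaled features.

```python
import math


def extract_features(doc):
    """Turn one document into a numeric feature vector."""
    text = doc["text"]
    digits = sum(ch.isdigit() for ch in text)
    return [
        doc["height"] / doc["width"],  # aspect ratio of the picture
        float(len(text.split())),      # word count
        digits / max(len(text), 1),    # share of digit characters
    ]


def predict(doc, labeled_docs):
    """Return the label of the closest labeled document (1-nearest neighbor)."""
    target = extract_features(doc)
    closest = min(
        labeled_docs,
        key=lambda item: math.dist(target, extract_features(item)),
    )
    return closest["label"]


# Made-up labeled items; a real training set would hold thousands of them.
labeled = [
    {"width": 600, "height": 1800, "label": "receipt",
     "text": "Cafe Roma 2 espresso 5.00 total 5.00"},
    {"width": 1200, "height": 600, "label": "ticket",
     "text": "Train ticket Berlin to Munich 12 March 2019 09:15 "
             "seat 41 coach 7 second class adult"},
]

new_doc = {"width": 620, "height": 1700,
           "text": "Bistro Nord 1 soup 4.50 total 4.50"}
guess = predict(new_doc, labeled)
```

The tall, short-worded picture lands next to the labeled receipt, so it is treated as a receipt too.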

Heading towards the future of AI-enabled assistants 

Last but not least comes another use case: tapping into OCR to analyze tickets and pay back travel expenses in case of a canceled trip. Still, sticking to a human-in-the-loop approach is wise, since everything about money requires reliable verification. It is necessary to train a machine to identify the date, the time, and the station name faultlessly. All this points to the use of a rule-based approach to ML and the need for human input.
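
A rule of this kind might look like the following Python sketch. The ticket wording ("from ... on DD.MM.YYYY at HH:MM") is an assumed layout invented for this example; real tickets differ, and each layout would need its own rule.

```python
import re
from datetime import datetime

# Assumed layout: "... from <station> on DD.MM.YYYY at HH:MM"
TICKET_PATTERN = re.compile(
    r"from (?P<station>[A-Za-z .'-]+?) "
    r"on (?P<date>\d{2}\.\d{2}\.\d{4}) "
    r"at (?P<time>\d{2}:\d{2})"
)


def parse_ticket(text):
    """Return (fields, needs_review); fields is None when nothing reliable was read."""
    match = TICKET_PATTERN.search(text)
    if match is None:
        return None, True
    try:
        departure = datetime.strptime(
            f"{match['date']} {match['time']}", "%d.%m.%Y %H:%M"
        )
    except ValueError:
        # The pattern matched, but the date or time is impossible (e.g. 31.02).
        return None, True
    return {"station": match["station"].strip(), "departure": departure}, False


# Made-up recognized text, for illustration only.
good = parse_ticket("Ticket from Berlin Hbf on 12.03.2019 at 09:15")
bad_date = parse_ticket("Ticket from Berlin Hbf on 31.02.2019 at 09:15")
unreadable = parse_ticket("smudged text")
```

Whenever the rule fails or the values make no sense, the ticket is flagged for a person to check before any money is paid back.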

But just think that out of, say, 60 employees gathering and analyzing information manually, only 10 will be left. Such a level of automation will give businesses considerable perks, namely increased productivity and resource savings. One would say that an obvious downside is job elimination and machines taking over from humans in many areas. Far from it, actually. According to a World Economic Forum report, AI could cut 75 million jobs, yet open up 133 million new opportunities by 2022.

To ensure a smooth integration of OCR into traditional document management processes, it is imperative to prepare employees for the new approach. The following measures can help top management show a responsible attitude and secure a positive impact on business processes:

  • start with adopting an AI-enabled system providing a low level of automation
  • create a user-friendly interface
  • engage employees and persuade them to welcome changes
  • introduce further automation gradually

OCR is already mature enough to be used successfully for automatic document processing. It can dramatically cut the time and resources spent on digitizing paper documents, although the scope of automation depends on the type of business, its specific requirements, and its readiness for AI-driven changes.

So far, the technology cannot be used efficiently without human involvement. The more specific the use case for OCR is (e.g., in legal or economic fields), the more human input is required. Machines are created to be a helping tool that assists humans, both in sophisticated tasks and in day-to-day ones.