Optical Character Recognition (OCR) is a key technology in automating Accounts Payable processes. Most departments are familiar with it, and many leverage it in some way (either themselves or indirectly through a service). Because OCR can capture and digitize information from invoices and other purchasing documents in almost any format, it is commonly used to manage documents from suppliers that insist on sending invoices on paper or as emailed PDFs. The information OCR captures is pivotal because, once it is in an electronic format, it is checked for accuracy against other documents related to the purchase and sent to management for approval. This sounds good enough on the surface, and OCR is undeniably useful in a myriad of scenarios, but in reality it has its shortcomings. This leads us to question whether OCR is truly automation at all or just a short-term solution to a larger issue.

OCR is unreliable and still requires a great deal of human intervention. In our analyst briefings with AP automation software providers, Levvel Research has observed a popular trend: almost every provider will proudly tout a 99% to 99.9% accuracy rate for their OCR technology. Seems impressive, right? The devil is in the details, though. The question that bears asking is how these companies measure and define accuracy. Common sense would lead you to believe that with a 99% accuracy rate, 1 out of every 100 invoices would have an error. Depending on your invoice volume, that would not be a serious concern; small organizations could reasonably go weeks without seeing an invoice with an error. However, most solution providers measure their accuracy by character count, not invoice count. At 99%, this means that 1 out of every 100 characters would be read incorrectly, and since a single invoice can easily contain hundreds of characters, there could very well be an error on every single invoice. This is why even the most advanced mailroom service providers have humans doing manual verification for accuracy, and most AP solutions with built-in OCR still require the user to check exceptions and manually correct them.
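To make the gap between character-level and invoice-level accuracy concrete, here is a rough sketch in Python. It assumes each character is misread independently with the same probability, which is a simplification (real OCR errors tend to cluster on poor scans or unusual fonts), and the character counts per invoice are illustrative guesses, not figures from any provider. The function names are hypothetical.

```python
def prob_invoice_has_error(char_accuracy: float, chars_per_invoice: int) -> float:
    """Chance that at least one captured character on an invoice is misread.

    Assumes independent, identically likely errors per character
    (a simplifying assumption, not a property of any real OCR engine).
    """
    if not 0.0 <= char_accuracy <= 1.0:
        raise ValueError("char_accuracy must be between 0 and 1")
    if chars_per_invoice < 0:
        raise ValueError("chars_per_invoice must be non-negative")
    # P(no errors) = accuracy ** n, so P(at least one error) = 1 - accuracy ** n
    return 1.0 - char_accuracy ** chars_per_invoice


def expected_char_errors(char_accuracy: float, chars_per_invoice: int) -> float:
    """Average number of misread characters per invoice under the same assumption."""
    return (1.0 - char_accuracy) * chars_per_invoice


if __name__ == "__main__":
    # Illustrative character counts only; actual invoices vary widely.
    for accuracy in (0.99, 0.999):
        for n_chars in (100, 300, 1000):
            p = prob_invoice_has_error(accuracy, n_chars)
            e = expected_char_errors(accuracy, n_chars)
            print(f"accuracy={accuracy:.1%} chars={n_chars:>4} "
                  f"P(error on invoice)={p:.1%} expected misreads={e:.1f}")
```

Under these assumptions, a "99% accurate" engine reading just 100 characters per invoice would still leave roughly a 63% chance of at least one misread on any given invoice, and the odds worsen as more fields are captured.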

Machine learning and artificial intelligence improve OCR significantly, as the software “learns” the format of specific vendors’ invoices, but this takes time and requires human intervention to teach the software where to find vital invoice information. Even with this capability, OCR can only improve so much, especially when a business expands its supplier list to new or overseas vendors whose invoices may use unfamiliar formats, languages, and currencies. Nor does this account for one-off purchases, which throw unknown formats at OCR and temporarily cripple the machine learning.

True automation in invoice management looks different from the OCR-backed process. Ideally, invoice receipt would be completely touchless: purchase orders, when used, are created electronically and flipped into invoices, and those invoices are then approved automatically. EDI and XML make this a realistic option for many companies that have the appropriate technology and technologically capable suppliers. For invoices sent on paper or as emailed PDF files, OCR should be seen as a temporary solution while suppliers update their processes enough to send documents in a true digital eInvoice format. To that effect, Levvel Research predicts OCR will increasingly fall out of favor in the coming years as organizations stop treating it as a permanent solution and start treating it as the band-aid it is.

That being said, OCR is still a valuable tool within almost any AP solution, especially when the tech is leveraged with AI and ML to increase its accuracy rates. We see it as a must-have in leading invoice management software (even if it’s only an optional feature), at least until true AP automation fully saturates the North American back office.