Free-form recognition of the 3rd generation

3rd Generation OCR/Free-Form Recognition

Rolf Holicki, Director Business Unit E-Invoicing/SAP&Web Process, SEEBURGER

Increase the efficiency of incoming invoice processing with modern solutions and sound master data

Free-form recognition of the 3rd generation simplifies and harmonizes the incoming invoice process sustainably. The potential of a modern OCR solution is even greater if the organisation and processes around incoming invoices are restructured and automated by means of workflow. Cleansed master data is of fundamental importance in this context.

It is impressive how quickly and accurately a modern OCR solution extracts invoice data from image-based invoices. However, it is short-sighted to assume that a high recognition rate alone leads to a significant efficiency gain across the entire invoice receipt process. After all, even the best OCR solution makes mistakes, and once an invoice document has to be post-processed manually because of an OCR error, it makes hardly any measurable difference to the processor whether they correct one field or two. The potential for improvement lies elsewhere.

If you want to make your invoice receipt process more efficient in the long term, you should first take a look at the invoice approval workflow after OCR and review the internal organizational and process structure of your company. After all, the invoice approval process can often be significantly improved if it only has to pass through one or two responsible departments instead of three or more as before. Furthermore, the process flow is greatly simplified if the processors involved are no longer assigned manually, but are determined automatically by intelligent rules in the workflow.
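As a rough illustration of what "determined automatically by intelligent rules" could look like, the following sketch picks an approver group from invoice attributes. The field names, thresholds, rule order, and group names are hypothetical and not taken from any specific workflow product.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Invoice:
    vendor_id: str
    amount: float
    po_number: Optional[str]    # purchase order reference, if any
    cost_center: Optional[str]  # cost center printed on the invoice, if any


# Each rule is (predicate, approver group); the first matching rule wins.
# Rules and group names are illustrative assumptions.
RULES: List[Tuple[Callable[[Invoice], bool], str]] = [
    (lambda inv: inv.po_number is not None, "purchasing"),
    (lambda inv: inv.amount >= 10_000, "finance-management"),
    (lambda inv: inv.cost_center is not None, "cost-center-owner"),
]


def assign_approver(invoice: Invoice, default: str = "accounts-payable") -> str:
    """Return the approver group of the first rule that matches."""
    for predicate, group in RULES:
        if predicate(invoice):
            return group
    return default
```

Keeping the rules in an ordered list makes the routing logic visible in one place, which is the point of the paragraph above: nobody has to assign processors by hand.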

By contrast, whether one OCR solution delivers a recognition rate of 83% and another one 85% is irrelevant. The recognition rates quoted by different OCR providers are not standardized and can therefore mean completely different things (see box).

What the statement ‘90% recognition rate’ can mean:

  1. 9 out of 10 characters are recognized correctly.
  2. 9 out of 10 invoice fields are recognized correctly.
  3. 9 out of 10 invoice pages are recognized correctly.
  4. 9 out of 10 invoices can be processed without any OCR correction.
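
To make the difference between these four readings concrete, here is a minimal sketch that computes all four rates from the same made-up OCR result. The sample data and function names are assumptions for illustration; character comparison is simplified by assuming the OCR output has the same length as the ground truth.

```python
# Ground truth vs. OCR output for a tiny, made-up sample.
# Structure: invoice -> pages -> fields -> (expected, recognized)
SAMPLE = [
    [  # invoice 1, one page
        [("INV-1001", "INV-1001"), ("119.00", "119.00")],
    ],
    [  # invoice 2, two pages; the letter O was read instead of a zero
        [("INV-1002", "INV-1O02"), ("250.00", "250.00")],
        [("DE123456789", "DE123456789")],
    ],
]


def char_rate(sample):
    """Share of correctly recognized characters."""
    pairs = [f for inv in sample for page in inv for f in page]
    total = sum(len(exp) for exp, _ in pairs)
    correct = sum(a == b for exp, rec in pairs for a, b in zip(exp, rec))
    return correct / total


def field_rate(sample):
    """Share of fields recognized without any error."""
    pairs = [f for inv in sample for page in inv for f in page]
    return sum(exp == rec for exp, rec in pairs) / len(pairs)


def page_rate(sample):
    """Share of pages on which every field is correct."""
    pages = [page for inv in sample for page in inv]
    return sum(all(e == r for e, r in page) for page in pages) / len(pages)


def invoice_rate(sample):
    """Share of invoices that need no OCR correction at all."""
    ok = sum(all(e == r for page in inv for e, r in page) for inv in sample)
    return ok / len(sample)
```

With this sample, a single misread character gives a character rate of 38/39 (about 97%), a field rate of 80%, a page rate of about 67%, and an invoice rate of 50%, all describing the same OCR output.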

The devil is in the details. The basic rule is: recognition rates can only be assessed in the context of the stored ERP master data. If, for example, an incorrect tax number, incomplete address data or poor order data is stored in the ERP master data for a vendor, if the VAT ID is missing, or if duplicate master records exist, even the best OCR software cannot correctly determine the vendor.
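
A simple master data check along these lines might look like the following sketch. The record layout and field names are assumptions and do not reflect a particular ERP schema; the sketch only flags vendors without a VAT ID and vendors sharing the same VAT ID.

```python
from collections import defaultdict

# Made-up vendor master records; field names are assumptions.
VENDORS = [
    {"id": "100", "name": "Acme GmbH", "vat_id": "DE123456789"},
    {"id": "101", "name": "ACME GmbH ", "vat_id": "de123456789"},
    {"id": "102", "name": "Beta AG", "vat_id": ""},
]


def missing_vat_ids(vendors):
    """Vendor IDs whose VAT ID is empty or whitespace only."""
    return [v["id"] for v in vendors if not v["vat_id"].strip()]


def duplicate_vat_ids(vendors):
    """Map each VAT ID used by more than one vendor to those vendor IDs."""
    by_vat = defaultdict(list)
    for v in vendors:
        vat = v["vat_id"].strip().upper()  # normalize before comparing
        if vat:
            by_vat[vat].append(v["id"])
    return {vat: ids for vat, ids in by_vat.items() if len(ids) > 1}
```

Both findings matter for vendor determination: a missing VAT ID removes the most reliable matching key, and a duplicate leaves the recognition with two equally plausible vendors.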

This is why cleansed master data is so important:

A typical invoice page consists of 3,000 characters. With an assumed OCR character recognition rate of 99%, this means that approximately 30 characters are incorrectly recognized per invoice page. How does this affect the recognition rates at the level of individual invoice fields and invoice pages?

Let’s assume an invoice field (e.g. supplier’s street) consists of 10 characters. This results in a field recognition rate of approximately 90% (0.99^10 ≈ 0.9). Based on an invoice page with 15 invoice fields to be extracted (invoice number, vendor name, invoice amounts, etc.), the recognition rate per page is thus reduced to approx. 20% (0.9^15 ≈ 0.2). This means that based on OCR recognition alone, only one out of five single-page invoices is recognized completely correctly, unless the invoice fields are subjected to additional plausibility checks (mathematical, value range, format, etc.) and validated against the master data, as is usual in 3rd generation free-form recognition solutions.
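
The arithmetic can be reproduced with a few lines of Python. The sketch assumes, as the calculation above implicitly does, that recognition errors are independent of one another; the function and variable names are chosen for illustration only.

```python
def compound_rate(unit_rate: float, units: int) -> float:
    """Probability that all `units` independent items are recognized correctly."""
    return unit_rate ** units


CHAR_RATE = 0.99          # assumed character recognition rate
CHARS_PER_FIELD = 10      # assumed field length
FIELDS_PER_PAGE = 15      # assumed number of fields to extract per page

# Field level: all 10 characters must be correct.
field = compound_rate(CHAR_RATE, CHARS_PER_FIELD)

# Page level, using the rounded field rate of 0.9 as in the text.
page_rounded = compound_rate(0.9, FIELDS_PER_PAGE)

# Page level without intermediate rounding.
page_exact = compound_rate(field, FIELDS_PER_PAGE)

print(f"field: {field:.3f}, page (rounded): {page_rounded:.3f}, "
      f"page (exact): {page_exact:.3f}")
```

The field rate comes out at roughly 0.904 and the page rate at roughly 0.21 to 0.22, depending on whether the intermediate value is rounded, which matches the "one out of five" figure.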

The decisive question for companies that want to increase the efficiency of their invoice receipt process is therefore not so much the recognition rate of the OCR solution, but rather what quality level the vendor master, order and goods receipt data must have to achieve a dark booking rate (invoices posted fully automatically, without manual intervention) of more than 50%.

Free-form recognition of the 3rd generation

OCR results are processed by an interpretation component, the so-called recognition. In modern OCR solutions, this technology is based on a 3rd generation free-form approach. In order to achieve corresponding interpretation results, the free-form approach is based, among other things, on keywords and relations combined with corresponding background knowledge (vendor master, order data), which is provided by the ERP system.
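
The idea of combining keywords with background knowledge can be sketched as follows. This is a deliberately simplified illustration, not how any particular recognition engine works: the sample text, keyword patterns, and the `VENDOR_MASTER` lookup are all assumptions. Values are located next to keywords, the vendor is confirmed against master data, and the amounts are checked for mathematical plausibility.

```python
import re
from decimal import Decimal

OCR_TEXT = """Acme GmbH
VAT ID: DE123456789
Invoice No.: INV-1001
Net amount: 100.00
VAT 19%: 19.00
Total: 119.00"""

# Hypothetical extract of the ERP vendor master: VAT ID -> vendor number.
VENDOR_MASTER = {"DE123456789": "100"}

# Field name -> pattern for the value expected next to a keyword.
KEYWORDS = {
    "invoice_number": r"Invoice No\.?:\s*(\S+)",
    "vat_id": r"VAT ID:\s*([A-Z]{2}\w+)",
    "net": r"Net amount:\s*([\d.]+)",
    "tax": r"VAT \d+%:\s*([\d.]+)",
    "gross": r"Total:\s*([\d.]+)",
}


def extract_fields(text):
    """Return the value found next to each keyword, or None if not found."""
    fields = {}
    for name, pattern in KEYWORDS.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1) if match else None
    return fields


def validate(fields):
    """Return a list of plausibility issues; an empty list means none found."""
    issues = []
    if fields["vat_id"] not in VENDOR_MASTER:
        issues.append("vendor not found in master data")
    try:
        net, tax, gross = (Decimal(fields[k]) for k in ("net", "tax", "gross"))
        if net + tax != gross:
            issues.append("net + tax does not equal gross")
    except (TypeError, ArithmeticError):
        # A missing (None) or unparsable amount ends up here.
        issues.append("amount missing or unreadable")
    return issues
```

For the sample text, `validate(extract_fields(OCR_TEXT))` returns an empty list; a misread total would be caught by the sum check even though the OCR step itself reported no error.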


  • No templates:
    In contrast to the previously common form-based recognition approaches, the free-form approach no longer requires the invoice recipient to create vendor-specific invoice templates or the like before invoices from new vendors can be processed.
  • Fault tolerance:
    The free-form approach is inherently fault tolerant. Field contents that are not detected or are detected incorrectly are not system errors that terminate the automated processing of the incoming invoice; they are an expected ‘recognition blur’ within the free-form approach. Possible causes include the interpretation logic, OCR errors, or the position of a value relative to its keywords.
  • Learning system:
    Through daily correction and the integrated learning process, the recognition rate of the 3rd generation of free-form recognition is continuously improved.
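
The fault tolerance and learning properties can be illustrated with a small sketch. It is a simplistic stand-in under stated assumptions: the class and function names are hypothetical, and "learning" is reduced to remembering manual corrections per vendor, which is far less than a real recognition engine does. The point is only that an uncertain field is routed to manual review instead of aborting processing.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldResult:
    name: str
    value: Optional[str]   # None if nothing was recognized
    confident: bool        # whether the recognition considers the value reliable


@dataclass
class CorrectionMemory:
    """Remembers manual corrections per vendor (a stand-in for 'learning')."""
    hints: dict = field(default_factory=dict)

    def learn(self, vendor_id, field_name, value):
        self.hints[(vendor_id, field_name)] = value

    def suggest(self, vendor_id, field_name):
        return self.hints.get((vendor_id, field_name))


def triage(vendor_id, results, memory):
    """Split results into accepted values and fields needing manual review."""
    accepted, review = {}, []
    for r in results:
        if r.value is not None and r.confident:
            accepted[r.name] = r.value
        else:
            # No exception is raised: the uncertain field goes to a clerk,
            # with a suggestion from earlier corrections if one exists.
            review.append((r.name, memory.suggest(vendor_id, r.name)))
    return accepted, review
```

After a clerk calls `memory.learn("100", "currency", "EUR")`, the next invoice from vendor 100 with an unreadable currency field is still sent to review, but now with "EUR" as a suggestion.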

[Figure: Recognition methods]

With its products Invoice Portal Cloud Service and the SAP-integrated solution Purchase-to-Pay, SEEBURGER AG, in cooperation with its OCR/recognition partner TCG, relies exclusively on 3rd generation free-form recognition.


Written by:

Rolf Holicki, Director BU E-Invoicing, SAP&Web Process, is responsible for the SAP/WEB applications. He has more than 25 years of experience in e-invoicing, SAP, Workflow and business process automation. Rolf Holicki has been with SEEBURGER since 2005.