Image recognition and understanding

Intelligent document analysis “recognAIze”

With the intelligent document analysis of recognAIze, data from documents can be recognized and evaluated automatically.

Where does the AI application offer the greatest benefit?

In many companies and administrative institutions, manually reviewing receipts, invoices and other documents, capturing them digitally and making them available costs a great deal of time and money. Intelligent document analysis systems based on optical character recognition (OCR), such as “recognAIze”, address this by enabling fast, simple and automated analysis as well as blind processing of all types of documents. Thanks to artificial intelligence, documents are automatically captured, read, assigned and processed further. Damaged originals, low-quality scans and, in particular, confidential documents are processed without further human intervention and in line with high data protection standards.

What are the quality indicators in these types of AI applications?

  • The basis of document analysis is the input data to be analyzed. Because documents are usually captured at varying image quality, automated image enhancement is a very important part of the AI application.
  • AI-based optical character recognition (OCR) using artificial neural networks ensures that not only individual text characters are recognized and processed, but also text passages and the structure of a document (e.g. headers or footnotes).
  • Through layout analysis, the AI application can also identify tables within a document and interpret their contents, for example to process invoices automatically in accounting.
  • Particularly in the case of sensitive information, the AI used must be secure and all data must be processed in a GDPR-compliant manner, either on German servers or on-premises at the customer’s site.
  • In the future, handwriting recognition (ICR) will also play a role in the applications in order to open up additional fields of application and achieve a complete transfer of content.

What AI technology powers the KI.NRW demonstrator?

Deep Learning OCR

The optical character recognition (OCR) combines convolutional neural networks (CNNs) and long short-term memory (LSTM) networks, two current approaches in the field of artificial intelligence, to derive characters from pixels. It extracts the text from the images and generates a structured XML file for each document, containing position data for the recognized words and page regions.
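To give a feel for how such a structured XML output could be consumed downstream, here is a minimal Python sketch. The element and attribute names (`document`, `page`, `region`, `word`, `x`, `y`, `width`, `height`) are purely hypothetical; the actual recognAIze schema is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical OCR output. The real recognAIze schema is not given in the text,
# so every element and attribute name below is an assumption for illustration.
SAMPLE_XML = """
<document>
  <page number="1">
    <region type="header">
      <word x="40" y="20" width="90" height="18">Invoice</word>
      <word x="140" y="20" width="60" height="18">2024</word>
    </region>
    <region type="body">
      <word x="40" y="80" width="70" height="16">Total</word>
      <word x="120" y="80" width="80" height="16">119.00</word>
    </region>
  </page>
</document>
"""

def read_words(xml_text):
    """Return one dict per recognized word with its page, region type and box."""
    root = ET.fromstring(xml_text.strip())
    words = []
    for page in root.iter("page"):
        for region in page.iter("region"):
            for word in region.iter("word"):
                box = tuple(int(word.get(k)) for k in ("x", "y", "width", "height"))
                words.append({
                    "page": int(page.get("number")),
                    "region": region.get("type"),
                    "text": word.text,
                    "box": box,  # (x, y, width, height) in pixels
                })
    return words

words = read_words(SAMPLE_XML)
for entry in words:
    print(entry)
```

Keeping the bounding box next to each word is what allows later steps, such as layout or table analysis, to reason about where text sits on the page.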

Image enhancement

To achieve the best possible results, negative influences such as insufficient exposure of the scanned document, curvature or distortion in the image must be compensated for. For this purpose, the image enhancement algorithms perform grayscale conversion and binarization. In addition, procedures are used to remove curvature and other disturbing factors.
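The two steps named above, grayscale conversion and binarization, can be sketched in a few lines of plain Python. The threshold is chosen here with Otsu's method, which is an assumption made for illustration; the text does not say which binarization technique the demonstrator uses. The tiny sample image is made up.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

def otsu_threshold(gray_image):
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    histogram = [0] * 256
    for row in gray_image:
        for value in row:
            histogram[value] += 1
    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    background_count, background_sum = 0, 0
    for level in range(256):
        background_count += histogram[level]
        background_sum += level * histogram[level]
        if background_count == 0:
            continue  # no pixels at or below this level yet
        foreground_count = total - background_count
        if foreground_count == 0:
            break  # every pixel is already in the background class
        mean_bg = background_sum / background_count
        mean_fg = (weighted_total - background_sum) / foreground_count
        variance = background_count * foreground_count * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold

def binarize(gray_image, threshold):
    """Map pixels above the threshold to 1 (paper), the rest to 0 (ink)."""
    return [[1 if value > threshold else 0 for value in row] for row in gray_image]

# Made-up 2x3 scan: a dark vertical stroke on unevenly lit paper.
rgb = [
    [(250, 250, 250), (30, 30, 30), (240, 240, 240)],
    [(200, 200, 200), (20, 20, 20), (210, 210, 210)],
]
gray = to_grayscale(rgb)
threshold = otsu_threshold(gray)
binary = binarize(gray, threshold)
print(binary)
```

A single global threshold like this works for mildly uneven lighting; strongly shaded scans would typically need a locally adaptive threshold instead.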

Layout recognition

Layout recognition identifies the structure of the text. It helps to divide the recognized characters into columns, text sections or headings and to determine a reading order. In this way, table structures can also be recognized and output as tables again, e.g. as a CSV file. The output is provided with appropriate metadata.
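As a rough illustration of how word positions can be turned back into a table and written as CSV, the following Python sketch groups words into rows by their vertical position and orders each row from left to right. It assumes one word per cell and a fixed pixel tolerance, both simplifications chosen for this example; the function names are hypothetical and do not describe the demonstrator's actual method.

```python
import csv
import io

def words_to_rows(words, row_tolerance=8):
    """Group (text, x, y) words into table rows, then order each row left to right.

    Words whose y coordinate lies within row_tolerance pixels of the current
    row's first word are treated as belonging to that row.
    """
    rows = []
    for text, x, y in sorted(words, key=lambda w: w[2]):
        if rows and abs(rows[-1]["y"] - y) <= row_tolerance:
            rows[-1]["cells"].append((x, text))
        else:
            rows.append({"y": y, "cells": [(x, text)]})
    return [[text for _, text in sorted(row["cells"])] for row in rows]

def rows_to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()

# Made-up recognized words with slightly jittered positions, in arbitrary order.
recognized = [
    ("Price", 200, 101), ("Item", 40, 100),
    ("Paper", 40, 130), ("4.50", 200, 132),
    ("Ink", 40, 160), ("12.00", 200, 159),
]
table_rows = words_to_rows(recognized)
csv_text = rows_to_csv(table_rows)
print(csv_text)
```

Real tables vary much more from document to document (merged cells, multi-word cells, missing rulings), which is why a learned layout model is used rather than a fixed rule like this one.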

“Thanks to the methods used for image enhancement, layout and character recognition, even poor quality documents can be evaluated.”
Dr. Nicolas Flores-Herr
Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS

What does the AI demonstrator show?

The KI.NRW demonstrator “recognAIze” brings AI-supported document analysis to life. You can upload your own photographed or scanned documents through the application and use them to test the intelligent document analysis. Step-by-step animations guide you through the AI technologies used in the demonstrator.

Optimize documents

Photographed or scanned documents often vary in image quality and are sometimes creased, torn or dirty. Image enhancement processes ensure that even old or damaged documents can be processed. The “recognAIze” demonstrator walks you through the range of optimization options that are essential for high-quality document analysis.

Recognize characters and structures

The accuracy and speed of the “recognAIze” OCR engine for intelligent character recognition are higher than those of leading market players. Without templates or manual post-processing, the demonstrator recognizes document layout elements such as sender information or dates. Even complex text content, such as text that wraps around images, is reliably recognized by the application.

Understand tables

Tables pose a particular challenge because their structure can differ from document to document. AI methods subdivide the table contents by information type and interpret the segments individually.

Classify content

The demonstrator “recognAIze” determines the properties of the document, evaluates the individual elements and thereby enables a whole range of subsequent processing. For example, the intelligent classification makes blind processing of confidential documents possible in the first place. This means that information can be aggregated or used pseudonymously without a human being having access to the documents. In this way, sensitive, personal data can be protected better.
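The idea of aggregating information pseudonymously can be illustrated with a small Python sketch: a personal field is replaced by a keyed hash before any aggregation takes place, so the totals can be used without exposing the name. This is only one possible approach under stated assumptions. The key, the field names and the sample records are hypothetical and say nothing about how the demonstrator implements blind processing.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret; in practice it would be stored outside the code.
SECRET_KEY = b"replace-with-a-real-secret"

def pseudonymize(value, key=SECRET_KEY):
    """Replace a personal value with a stable, keyed pseudonym (HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability

def aggregate_by_pseudonym(records):
    """Sum invoice amounts per customer without keeping the customer name."""
    totals = defaultdict(float)
    for record in records:
        totals[pseudonymize(record["customer"])] += record["amount"]
    return dict(totals)

# Made-up extracted fields from three documents.
records = [
    {"customer": "Erika Mustermann", "amount": 100.0},
    {"customer": "Max Mustermann", "amount": 25.0},
    {"customer": "Erika Mustermann", "amount": 50.0},
]
totals = aggregate_by_pseudonym(records)
print(totals)
```

Because the same input always maps to the same pseudonym, documents belonging to one person can still be linked and aggregated, while the name itself never appears in the result.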

Create interfaces

AI-supported document analysis is often at the beginning of a process chain, whether in accounting or in archives. To enable further processing steps, the KI.NRW demonstrator offers various output formats such as XML or PDF.

Are you curious?
Click here to go to the demonstrator!

Where can more information be found?

Study by KI.NRW

Learn where we encounter modern language technologies in our everyday and professional lives and what economic opportunities they offer (available in German only).

AI products “Made in NRW”

Filter our AI map by “language and text comprehension”.

AI providers from NRW

Our AI map shows which providers include AI methods related to image recognition in their portfolio.

Contact us

Dr. Nicolas Flores-Herr

Business Unit Manager Document Analytics

Fraunhofer IAIS
Schloss Birlinghoven
53757 Sankt Augustin

Phone +49 2241 142532

Send email

Dr. Iuliu Konya

Senior Research Engineer

Fraunhofer IAIS
Schloss Birlinghoven
53757 Sankt Augustin

Phone +49 2241 142543

Send email

Marius Nißlmüller

Student assistant Business Development

Fraunhofer IAIS
Schloss Birlinghoven
53757 Sankt Augustin

Send email
