How text recognition works with AI and what artificial intelligence can do in document and text recognition.
To check and continuously improve the quality of our AI-supported document and text recognition, we have created large test vectors. These also let us analyze precisely the individual steps involved in preparing documents (learn more about this here).
Submissions from the smartphone camera
When customers are asked to submit photos of documents themselves, there are often surprises: distortions, bent or crumpled paper, blur, rotations, and so on.
Below are some particularly adventurous shots, which we of course took ourselves for testing. We printed out a publicly available contract template from BVAEB (Versicherungsanstalt öffentlich Bediensteter) and digitized it again with our smartphone in a particularly creative way.
Preparation with Artificial Intelligence
In each of the images, our cell phone photo is on the left. On the right is the result from our pipeline of several AI networks, which have corrected the image as well as possible. This corrected version can then be run through the actual text recognition process, with much better results than from the original photo.
Our pre-processing manages to flatten the text and bring it into straight lines, a very important preparation so that the text can subsequently be recognized.
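The staged structure described above, several correction steps followed by text recognition, can be sketched as a simple chain of functions. This is only an illustrative sketch: the real stages are AI networks, and the two stages below (`rotate_180`, `crop_empty_border`) and the `run_pipeline` helper are hypothetical stand-ins that were chosen because they run without any dependencies.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)
Stage = Callable[[Image], Image]


def rotate_180(image: Image) -> Image:
    """Hypothetical stand-in for a rotation-correction stage."""
    return [row[::-1] for row in image[::-1]]


def crop_empty_border(image: Image, background: int = 255) -> Image:
    """Hypothetical stand-in for a cropping stage.

    Drops every row and column that contains only background pixels.
    """
    rows = [row for row in image if any(p != background for p in row)]
    if not rows:
        return image  # nothing but background, leave the image unchanged
    keep = [c for c in range(len(rows[0]))
            if any(row[c] != background for row in rows)]
    return [[row[c] for c in keep] for row in rows]


def run_pipeline(image: Image, stages: List[Stage]) -> Image:
    """Apply each stage in order; the result is what text recognition would receive."""
    for stage in stages:
        image = stage(image)
    return image


photo = [
    [255, 255, 255],
    [255, 0, 10],
    [255, 255, 255],
]
cleaned = run_pipeline(photo, [rotate_180, crop_empty_border])
```

Keeping every stage behind the same image-in, image-out signature makes it possible to inspect the intermediate result after each step, which fits the step-by-step analysis mentioned earlier.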
Even very extreme folds in the paper can be mastered.
Two or more kinks, whether horizontal or vertical, are also “ironed out”.
Two pre-processing steps are relevant here: the crumpling of the paper must be corrected, and so must the bending of the entire sheet. As you can see, this works surprisingly well. The shadows on the paper do look more intense (because we increase the contrast), but the text recognition can handle this well.
This is the absolute extreme case: hopefully it is the exception for customers to send in their papers like this. Nevertheless, the lines of text can still be restored quite well. The font itself ends up looking italicized, but that is no problem for the text recognition afterwards.
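To illustrate why raising the contrast also makes shadows more visible, here is a minimal sketch of a percentile-based contrast stretch. It is an assumption for illustration only; the article does not say which contrast method the pipeline uses, and the function name `stretch_contrast` and its parameters are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def stretch_contrast(image: Image, low_pct: float = 2.0,
                     high_pct: float = 98.0) -> Image:
    """Linearly rescale gray levels so the chosen percentiles map to 0 and 255."""
    pixels = sorted(p for row in image for p in row)
    if not pixels:
        return image
    last = len(pixels) - 1
    lo = pixels[round(last * low_pct / 100)]
    hi = pixels[round(last * high_pct / 100)]
    if hi <= lo:
        # Flat image: there is no contrast to stretch, return a copy.
        return [row[:] for row in image]
    scale = 255 / (hi - lo)
    # Clamp so values outside the percentile range stay within 0..255.
    return [[min(255, max(0, round((p - lo) * scale))) for p in row]
            for row in image]


# Dull photo: ink at 100, shadowed paper at 120 and 140, bright paper at 160.
dull = [[100, 120], [140, 160]]
stretched = stretch_contrast(dull, low_pct=0, high_pct=100)
```

In this toy example the two shadow levels, which differed from each other by only 20 gray levels, end up 85 levels apart. The same stretch that separates ink from paper therefore also exaggerates shadows.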
Again, the curved page results in an italicized, but very readable, new image.
And here’s a second extreme case: we clearly overdid it, because this attempt to photograph a sheet of paper is a joke.
Still, we were very pleasantly surprised by what our AI conjured out of it.