OCR & Layout Analysis

When newspapers are digitized, OCR does not recognize every element in its correct form. Some columns are not recognized at all (runs). The OCR result therefore has to be corrected by our layout correction.

Layout correction

We use FineReader as our text recognition tool. Its layout and column analysis is then optimized with additional software developed by PPS (Corrector), which significantly improves the column analysis. Accurate column and article detection is especially important for the automatic article recognition (AAR) step that follows.
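How Corrector works internally is not described here. As a rough illustration of what a column analysis step does, the sketch below groups OCR text blocks into columns by their horizontal overlap. It is a generic heuristic, not the PPS method; the `Block` structure, the `group_into_columns` function, and the `min_overlap` threshold are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Bounding box of one OCR text block in pixels (hypothetical structure)."""
    x0: int
    y0: int
    x1: int
    y1: int


def group_into_columns(blocks, min_overlap=0.5):
    """Group text blocks into columns by horizontal overlap.

    Blocks are visited from left to right. A block joins the first existing
    column whose horizontal extent it overlaps by at least `min_overlap` of
    the narrower width; otherwise it starts a new column.
    """
    columns = []  # each entry: [x0, x1, list_of_blocks]
    for b in sorted(blocks, key=lambda blk: (blk.x0, blk.y0)):
        for col in columns:
            overlap = min(col[1], b.x1) - max(col[0], b.x0)
            narrower = min(col[1] - col[0], b.x1 - b.x0)
            if narrower > 0 and overlap / narrower >= min_overlap:
                # Widen the column to cover the new block and add it.
                col[0] = min(col[0], b.x0)
                col[1] = max(col[1], b.x1)
                col[2].append(b)
                break
        else:
            columns.append([b.x0, b.x1, [b]])
    # Reading order: columns left to right, blocks top to bottom.
    ordered = sorted(columns, key=lambda c: c[0])
    return [sorted(c[2], key=lambda blk: blk.y0) for c in ordered]
```

Given four blocks that form two side-by-side columns, the function returns two lists of two blocks each, in reading order. A real newspaper page would also need handling for headlines that span several columns, which this sketch ignores.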

FineReader Layout

PPS Layout