Separation of single articles

Single-article separation groups the article elements of the XML file into complete articles. It follows the text flow and relies on logical, typographical and semantic analysis. Images are assigned to the articles they belong to.
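The grouping step can be illustrated with a minimal sketch. It is not the product's implementation: the `Block` structure, the `next_id` text-flow link and the nearest-centre rule for images are all assumptions made for illustration, and the typographical and semantic analysis mentioned above is left out entirely.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Block:
    """Hypothetical layout element taken from the page XML."""
    id: str
    kind: str                                  # "text" or "image"
    bbox: Tuple[float, float, float, float]    # (x0, y0, x1, y1)
    next_id: Optional[str] = None              # assumed text-flow successor


def _centre(block):
    x0, y0, x1, y1 = block.bbox
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def _distance(a, b):
    (ax, ay), (bx, by) = _centre(a), _centre(b)
    return math.hypot(ax - bx, ay - by)


def separate_articles(blocks):
    """Follow the text flow to build articles, then attach images."""
    by_id = {b.id: b for b in blocks}
    text_blocks = [b for b in blocks if b.kind == "text"]
    successors = {b.next_id for b in text_blocks if b.next_id}

    articles = []
    for head in text_blocks:
        if head.id in successors:
            continue  # not the start of a chain
        chain, seen, current = [], set(), head
        while current is not None and current.id not in seen:
            seen.add(current.id)
            chain.append(current)
            current = by_id.get(current.next_id) if current.next_id else None
        articles.append({"blocks": chain, "images": []})

    # Assumption: an image belongs to the article with the closest text block.
    for image in (b for b in blocks if b.kind == "image"):
        if articles:
            nearest = min(
                articles,
                key=lambda a: min(_distance(image, t) for t in a["blocks"]),
            )
            nearest["images"].append(image)
    return articles
```

In this sketch, each chain of `next_id` links corresponds to one connection line on a control page, and every text block that is nobody's successor starts a new article.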

Control page after layout recognition for article separation. Each article is shown in its own colour; the connecting lines show the correct text flow.

Each article is saved in a separate XML file and linked to the source file. The articles are an ideal basis for accurate search results with PPS-Finder. Our system is not fixed; we always configure it to the customer's requirements:

  • customer-specific output (e.g. XML, database, CMS)
  • special adaptations for images and spreadsheets
  • articles continued across several pages are joined together
  • detection of "non-articles" such as advertisements
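As a rough sketch of how the joining, filtering and per-article output described above could fit together, the following Python example merges continued parts, skips advertisements and serialises each article with a link to its source file. The dictionary keys (`continues`, `is_ad`), the element names (`article`, `part`) and the `source` attribute are hypothetical; the real output format is configured per customer.

```python
import xml.etree.ElementTree as ET


def join_and_filter(parts):
    """Merge continued parts into one article and drop advertisements.

    Each part is a dict with the assumed keys "id", "page", "text",
    and optionally "continues" (id of the earlier part) and "is_ad".
    """
    by_id = {p["id"]: p for p in parts}

    def root_of(part):
        # Walk back along "continues" links to the first part.
        seen = set()
        while part.get("continues") in by_id and part["id"] not in seen:
            seen.add(part["id"])
            part = by_id[part["continues"]]
        return part

    merged = {}
    for part in sorted(parts, key=lambda p: p["page"]):
        if part.get("is_ad"):
            continue  # "non-article", e.g. an advertisement
        root = root_of(part)
        entry = merged.setdefault(
            root["id"], {"id": root["id"], "pages": [], "text": []}
        )
        entry["pages"].append(part["page"])
        entry["text"].append(part["text"])
    return list(merged.values())


def article_to_xml(article, source_file):
    """Serialise one article; "source" links back to the source file."""
    root = ET.Element("article", {"id": article["id"], "source": source_file})
    for page, text in zip(article["pages"], article["text"]):
        part = ET.SubElement(root, "part", {"page": str(page)})
        part.text = text
    return ET.tostring(root, encoding="unicode")
```

A caller would pass the result of `join_and_filter` to `article_to_xml` one article at a time and write each string to its own file. How a continuation or an advertisement is actually detected is outside this sketch.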