Picture Extraction & Facsimile

We automatically cut out editorial pictures and photos from newspapers. The following metadata is generated for each picture:

  • Picture ID
  • Picture caption
  • Coordinates of the picture
  • Issue number
  • Page number
  • Publication date

Metadata can be delivered in HTML or XML, depending on customer requirements. Roughly 85-90% of advertising images are filtered out.
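
As an illustration of what an XML delivery of the metadata listed above might look like, here is a minimal Python sketch. The element and attribute names (`picture`, `caption`, `coordinates`, `issue`, `page`, `date`), the `PictureRecord` class, and the choice of an x/y/width/height box for the coordinates are all assumptions made for this example. They are not the actual delivery schema, which is agreed with each customer.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class PictureRecord:
    """Hypothetical container for the metadata fields of one extracted picture."""
    picture_id: str
    caption: str
    x: int                 # assumed: left edge of the picture on the page, in pixels
    y: int                 # assumed: top edge of the picture on the page, in pixels
    width: int
    height: int
    issue_number: str
    page_number: int
    publication_date: str  # assumed: ISO 8601 date string


def to_xml(record: PictureRecord) -> str:
    """Serialize one record to an XML string (illustrative element names)."""
    root = ET.Element("picture", id=record.picture_id)
    ET.SubElement(root, "caption").text = record.caption
    ET.SubElement(
        root,
        "coordinates",
        x=str(record.x),
        y=str(record.y),
        width=str(record.width),
        height=str(record.height),
    )
    ET.SubElement(root, "issue").text = record.issue_number
    ET.SubElement(root, "page").text = str(record.page_number)
    ET.SubElement(root, "date").text = record.publication_date
    return ET.tostring(root, encoding="unicode")


example = PictureRecord(
    picture_id="pic-0001",
    caption="Example caption",
    x=120, y=340, width=800, height=600,
    issue_number="42",
    page_number=3,
    publication_date="2020-01-15",
)
xml_text = to_xml(example)
```

An HTML delivery could be produced from the same record in the same way, with only the serialization step changed.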

Pages without pictures:

The pictures are removed from the pages so that the photographers' copyright is not infringed.
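
The idea of removing pictures by their coordinates can be sketched as follows. This is only an illustration under stated assumptions: the page is modeled as a grid of grayscale pixel values, each picture is given as a hypothetical `(left, top, right, bottom)` box with exclusive right and bottom edges, and the function name `blank_regions` is invented for this example. The production process may work quite differently.

```python
def blank_regions(page, boxes, fill=255):
    """Return a copy of `page` with every box overwritten by `fill`.

    page  -- list of rows, each row a list of grayscale values (0-255)
    boxes -- iterable of (left, top, right, bottom) tuples, right/bottom exclusive
    fill  -- value written into the removed areas (255 = white)
    """
    result = [row[:] for row in page]  # copy so the original page is untouched
    height = len(result)
    width = len(result[0]) if result else 0
    for left, top, right, bottom in boxes:
        # Clamp each box to the page so out-of-range coordinates are harmless.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                result[y][x] = fill
    return result


# A tiny 4x4 "page" of black pixels with one picture in the middle.
page = [[0] * 4 for _ in range(4)]
cleaned = blank_regions(page, [(1, 1, 3, 3)])
```

Returning a copy keeps the original scan available, for example for the facsimile output, while the cleaned version is delivered without pictures.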

Facsimile:

For digiPaper and ePaper, we automatically cut out the articles and pictures and generate the article text in XML format. We adapt the XML schema individually to the customer’s requirements.