Extract Data from PDF within Workflow

Hi

Is there a way, through a Workflow script, to open a PDF and extract data from its first page so we can then add it to the metadata of the main document?

James

Hi @jbeal84,

That is possible, for example with the following steps:

  1. Select a PDF document as data file by going to: Workflow Configuration application > Debug > Select.
  2. Add a Set Job Infos and Variables Workflow plugin to your Workflow process.
  3. Select a Job Info or Variable by clicking in the first cell of the Var/Info# column.
  4. Right-click in the first cell of the Value column.
  5. Select the option Get Data Location… to extract data from the PDF data file.

Hi Marten,

Sorry, I don’t think I explained this fully. I need to do this from a script. I have an XML input and need to add extra data to it from the content of a PDF. My idea was a data mapper that reads in the XML and then, for each record, opens a PDF, extracts the information and adds it to the record. So this could be done either in a Workflow script or perhaps within the data mapper that reads the XML.

James
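
For illustration only, the per-record loop James describes could look roughly like the standalone Python sketch below. It is not the OL Connect Workflow or DataMapper API. The element names `record`, `pdf` and `pdf_text` are hypothetical, and the default extractor assumes the third-party pypdf package is installed.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def first_page_text(pdf_path: str) -> str:
    """Return the text of page 1, using the third-party pypdf package (assumed installed)."""
    from pypdf import PdfReader  # imported lazily so the rest runs without pypdf

    reader = PdfReader(pdf_path)
    return reader.pages[0].extract_text() or ""


def enrich_records(xml_text: str,
                   extract: Callable[[str], str] = first_page_text) -> str:
    """Add first-page PDF text to every record that names a PDF file."""
    root = ET.fromstring(xml_text)
    for record in root.iter("record"):         # hypothetical record element
        pdf_path = record.findtext("pdf")      # hypothetical child holding the PDF path
        if not pdf_path:
            continue                           # record has no PDF, leave it unchanged
        extra = ET.SubElement(record, "pdf_text")  # hypothetical output element
        extra.text = extract(pdf_path).strip()
    return ET.tostring(root, encoding="unicode")
```

The extractor is passed in as a parameter so the XML handling can be tried without any real PDF files, for example with `enrich_records(xml_text, extract=lambda path: "sample text")`.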