A customer will be sending us thousands of PDF files that require a number of steps to process, merge, and concatenate. I have most of the steps figured out, with the exception of merging PDFs for ‘householding’.
For example, the customer will send us PDF files with file names of [parentacct#_date].pdf and [childacct#-parentacct#_date].pdf. The PDFs that have the same ‘parentacct#’ need to be merged so that there is a single mailing to that household.
To further the example,
PDFA = (123456789_20230101.pdf) is the parent PDF and has 4 pages
PDFB = (987654321-123456789_20230101.pdf) would be the child PDF and has 6 pages
After merging, PDFA would contain pages 1-4 from the original parent (PDFA) followed by pages 5-10 from the child PDF (PDFB), for 10 pages in total.
Also, there could be more than one child PDF that needs to be merged with PDFA, so the script would need to loop until ALL child PDFs are merged with the parent PDF.
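To make the naming rule concrete, here is a minimal Python sketch of the grouping logic only (it does not touch the PDFs themselves). The regular expressions assume account numbers are digits only and the date is 8 digits, as in the example above; the function names `group_households` and `merge_order` are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed formats: digits-only account numbers and an 8-digit date.
PARENT_RE = re.compile(r"^(?P<parent>\d+)_(?P<date>\d{8})\.pdf$", re.IGNORECASE)
CHILD_RE = re.compile(
    r"^(?P<child>\d+)-(?P<parent>\d+)_(?P<date>\d{8})\.pdf$", re.IGNORECASE
)


def group_households(filenames):
    """Group file names by parent account number; unrelated names are ignored."""
    households = defaultdict(lambda: {"parent": None, "children": []})
    for name in sorted(filenames):
        child = CHILD_RE.match(name)
        if child:
            households[child.group("parent")]["children"].append(name)
            continue
        parent = PARENT_RE.match(name)
        if parent:
            households[parent.group("parent")]["parent"] = name
    return dict(households)


def merge_order(household):
    """Parent first (if one was received), then every child in sorted order."""
    first = [household["parent"]] if household["parent"] else []
    return first + household["children"]
```

With the two example files, `merge_order` returns the parent followed by the child, which is the page order described above (parent pages 1-4, then child pages 5-10). Any number of children for the same parent end up in the same list.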
That would be a mix of Workflow plugins and scripts.
I’d start by assuming that the whole process would run once all files are received.
Folder Capture plugin that watches a dummy folder for a dummy file, which triggers the whole process
Branch
Folder Capture plugin
a. Mask using Regex to only capture parent PDF
Move all parents to a PARENTS named folder using Send to Folder plugin
Back to Main Branch
Folder Capture plugin to capture all remaining child PDFs
Script that:
a. Read the name of the child PDF
b. Extract the parent name from the child name
c. Using the InsertFrom method of the Alambic API’s IPages object, insert the child PDF pages into the parent’s PDF
d. Close and save the parent PDF
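Steps a and b of that script come down to simple string handling. As a rough sketch in Python (the function name `parent_path_for_child` and the default `PARENTS` location are assumptions for illustration; the actual page insertion in step c would still go through the Alambic API and is not shown here):

```python
from pathlib import Path


def parent_path_for_child(child_name, parents_dir="PARENTS"):
    """Map '[childacct#]-[parentacct#]_[date].pdf' to the parent PDF's path."""
    # Everything after the first '-' is the parent's file name.
    child_acct, sep, parent_name = Path(child_name).name.partition("-")
    if not sep or not child_acct or not parent_name:
        raise ValueError(f"not a child file name: {child_name!r}")
    return Path(parents_dir) / parent_name
```

For the child 987654321-123456789_20230101.pdf this points at 123456789_20230101.pdf inside the PARENTS folder. A name without a hyphen (that is, a parent file) raises an error instead of being silently merged into itself.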
Actually, this can be made even simpler by using the Send to Folder task’s ability to automatically concatenate PDFs together.
Here is that sample process in a workflow configuration file: Merge.OL-workflow (21.6 KB)
The only script that’s needed is 2 lines long: it stores the parent file name and prefix in local variables that are used later on to pick up the child files.
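The script inside the attached configuration is not reproduced here, so the following is only a guess at the kind of name logic involved, written as a standalone Python sketch: given a parent file name, build a wildcard mask that matches its child files. The names `child_mask_for_parent` and `children_of` are hypothetical, and the mask assumes children share the parent's date as well as its account number.

```python
import fnmatch
from pathlib import Path


def child_mask_for_parent(parent_name):
    """Build a wildcard mask such as '*-123456789_20230101.pdf'."""
    stem = Path(parent_name).stem  # e.g. '123456789_20230101'
    return f"*-{stem}.pdf"


def children_of(parent_name, candidates):
    """Return the candidate file names that match the parent's child mask."""
    mask = child_mask_for_parent(parent_name)
    return sorted(name for name in candidates if fnmatch.fnmatch(name, mask))
```

In a Workflow process, a mask like this would typically be stored in a local variable and used as the file mask of the task that picks up the child files.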