I am working on a project where I will be receiving an unknown number of small print jobs in a “batch”. Each will be a single invoice, ranging from 1 to maybe 20 pages, and there will be somewhere between 1,000 and 3,000 invoices.
I want to receive and concatenate the entire run of them, so that I can convert it to one big PDF which I will archive and then split using metadata into 3 files to print based on postage and/or envelope/inserter capacity.
My question is whether there is a good way to determine that all of the print files have been received, in order to know when to kick off the next step in the process. I have worked through something like this in the past, where I came up with a kludge that involved incrementing a global variable each time a new print file came in, and then checking the value of the variable every x seconds. If the value had changed, it looped again; if not, it moved on to the next step. It worked, but it felt very delicate and I never completely trusted it.
I believe it’s safe to say that if no new files come in for 10 seconds, the job is done. But there is nothing that PlanetPress will receive that would tell it how many files to expect.
You could implement something similar to your counter, but use the timestamp of a file instead.
Each time you receive a job, store the timestamp of this file in a variable after it’s processed.
Then, in another process that runs every 10, 20, or 30 seconds, compare that timestamp to the current time to see how much time has elapsed since the last job. Once enough time has passed, you can trigger the creation of the big PDF. You’ll also need a flag indicating that the big PDF creation has started or is running, so it isn’t started more than once.
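As a rough sketch of that logic in plain Python (not PlanetPress-specific; the names BatchState, record_job, should_start_merge, finish_merge and QUIET_SECONDS are made up for illustration, and in a real workflow the two values would live in whatever global variables you use):

```python
import time

QUIET_SECONDS = 10  # assumed quiet period, based on the 10-second figure mentioned earlier


class BatchState:
    """Hypothetical holder for the two values shared between the processes."""

    def __init__(self):
        self.last_job_time = None   # time the most recent job finished processing
        self.merge_started = False  # True once the big PDF step has been triggered


def record_job(state, now=None):
    """Call after each small job has been processed."""
    state.last_job_time = time.time() if now is None else now


def should_start_merge(state, now=None):
    """Call from the polling process; returns True at most once per batch."""
    now = time.time() if now is None else now
    if state.last_job_time is None or state.merge_started:
        return False  # nothing received yet, or the merge is already under way
    if now - state.last_job_time < QUIET_SECONDS:
        return False  # a job arrived too recently; keep waiting
    state.merge_started = True  # set the flag so the merge is not started twice
    return True


def finish_merge(state):
    """Call when the big PDF has been created, to get ready for the next batch."""
    state.last_job_time = None
    state.merge_started = False
```

The polling process would call should_start_merge on each pass and only build the big PDF when it returns True.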
Ideally, the program which sends the small jobs would print a “final” page which could easily be analysed and trusted as the end of the job.
Sadly, the “final” page just isn’t going to happen. This process is replacing their current “solution”, which is just dumping all of the print files onto a printer - when the printer stops, the job is done…
A couple of questions on your suggestion: When you say “Store the timestamp of this file in a variable”, are you talking about saving the server time - %h%n%s? How would you do the time comparison? It seems to me that if I do a straight string comparison, the current time will always be greater than the value of the variable, and if I want to add a value to the variable I am going to have to write a script. Are you aware of another way? That would really help me out.
Time comparison would need to be done with a script, since a text comparison may not return the proper result. The string-encoded date could be converted to a proper datetime value, and then compared to the current datetime.
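A minimal sketch of that conversion in Python, assuming the variable holds a full date and time in the form "YYYY-MM-DD HH:MM:SS" (that format, and the names STAMP_FORMAT and seconds_since, are assumptions for illustration; adjust the format string to whatever you actually store):

```python
from datetime import datetime

# Assumed storage format for the timestamp variable, e.g. "2024-01-31 23:59:55".
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def seconds_since(stamp_text, now=None):
    """Convert the stored string to a datetime and return the seconds elapsed."""
    last = datetime.strptime(stamp_text, STAMP_FORMAT)
    now = datetime.now() if now is None else now
    return (now - last).total_seconds()
```

The script would then test something like seconds_since(stored_value) >= 10. Storing the date along with the time keeps the result correct when a batch runs past midnight, which a time-only value would not.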
Instead of a variable, you could also keep the path of the most recent processed file and then just use the timestamp of that file.
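For the file-based variant, a small Python sketch could look like this (seconds_since_modified is a hypothetical helper name, and the path would be whatever you saved for the most recent file):

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Return how many seconds ago the file at `path` was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)
```

Keep in mind that this measures the time since the file was last written, which is not necessarily the moment it finished processing.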