We have a longstanding issue with the “server” engine not releasing memory. If we do not restart the Connect Server after large jobs, the next jobs are produced with blank pages. Support has looked into this, but there has been no resolution so far. Our standard practice is to reboot the VM/OS every day at 2:15pm so the next large overnight jobs won’t crash.
Our jobs keep getting larger and larger. I’m at the point where I need to split the incoming data files into smaller batches (40k records per batch) and run each batch separately, rebooting Connect Server between batches. This isn’t optimal; I shouldn’t have to “babysit” jobs that are meant to run automatically.
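The splitting itself is easy enough to script. Here’s a rough PowerShell sketch of the approach, assuming one record per line and no header row (the file paths are hypothetical; a different data layout would need its own handling):

```powershell
# Split a line-based data file into 40,000-record batch files.
# Assumes one record per line with no header; paths are hypothetical.
$batchSize = 40000
$chunk = 0
$reader = [System.IO.File]::OpenText("C:\Data\input.txt")
try {
    while (-not $reader.EndOfStream) {
        $chunk++
        $writer = [System.IO.File]::CreateText("C:\Data\batch_$chunk.txt")
        try {
            $count = 0
            while ($count -lt $batchSize -and -not $reader.EndOfStream) {
                $writer.WriteLine($reader.ReadLine())
                $count++
            }
        }
        finally {
            $writer.Close()
        }
    }
}
finally {
    $reader.Close()
}
```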
Until/unless this bug is resolved, though, I have to consider workarounds.
If I had a PowerShell script that restarted Connect Server, and executed that script via the “External Program” plugin, would that “crash” Workflow? The idea is that, as the final step of a Workflow process, I restart the Server, check that it’s running again (maybe by coding the PowerShell script to return an error code that Workflow could check), and then loop back to capture the next batch of data.
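To make the question concrete, here’s a minimal sketch of the kind of script I mean. It assumes the Connect Server runs as a Windows service named OLConnect_Server; that name is my assumption, so check services.msc for the actual name on your install:

```powershell
# Restart the Connect Server service and signal success/failure via exit code.
# "OLConnect_Server" is an ASSUMED service name -- verify it in services.msc.
$serviceName = "OLConnect_Server"
try {
    Restart-Service -Name $serviceName -Force -ErrorAction Stop

    # Block until the service reports Running again (or time out after 2 minutes).
    $svc = Get-Service -Name $serviceName
    $svc.WaitForStatus("Running", (New-TimeSpan -Minutes 2))

    exit 0   # 0 = Server is back up
}
catch {
    Write-Error $_
    exit 1   # non-zero = restart failed or timed out
}
```

If the External Program task exposes the script’s return code (I’d have to test that), Workflow could branch on a non-zero result instead of blindly looping back to capture the next batch.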
What would be the impact on other running Workflow processes? My guess is that if I restart Connect Server while a Connect task is running, that job would be killed, so I would need to ensure that my large job is the only one running.
I have found the ticket (1464253) related to the issue you are describing. Our R&D team was involved, and it was decided at the time that you preferred rebooting the server, so the ticket was closed with your agreement.
Please open another ticket and refer to this one; we will gather all the logs and files needed and involve our R&D team again.
Hopefully we can help find a more manageable workaround or solution.
I went down the route of sending log files (several times), as well as the contents of the filestore, altering the settings in the server.ini file, etc. I didn’t PREFER to reboot; it was the only viable workaround, since a solution was never found!
Hi @TGREER,
I once had this issue, but it has been a long time since it last happened. What I remember having to do was split the records into batches, where the number of records was neither too large nor too small: just the right batch size. You might need to test to find the correct batch size, as each use case is different.
Splitting into batches is only a partial solution. Connect Server doesn’t release memory between batches, requiring a manual reset. This is a bug.
A batch of 40k records requires about 2 hours to run, and we frequently have files with over 250k records. I have to spend all day, several days in a row, spoon-feeding a process that should run automatically.
Since Support and R&D couldn’t resolve the issue (multiple calls, multiple batches of log files), I had to resort to scheduled reboots once a day. That workaround is no longer viable, and without a bug fix for the Connect Server memory issue, I need to find a new workaround.
This is true for all the engines, but not for Connect Server itself. There’s no GUI for the Server’s memory allocation; it has to be set in the server.ini file, and there’s no automatic restart setting for the Server.
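For anyone else hunting for that setting, here’s a minimal sketch for checking the current heap lines from PowerShell. Both the install path and the assumption that the memory allocation appears as Java -Xms/-Xmx lines are mine; confirm against your own server.ini:

```powershell
# Print the JVM heap lines (-Xms / -Xmx) from server.ini.
# The install path and the -Xms/-Xmx line format are ASSUMPTIONS --
# check your own server.ini to confirm both.
$ini = "C:\Program Files\Objectif Lune\OL Connect\Connect Server\server.ini"
Get-Content $ini | Select-String -Pattern '^-Xm[sx]'
```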
All the log files and history should be attached to that original ticket, so couldn’t you simply reopen it internally? In fact, I assumed R&D would continue to work on such a serious issue. The impasse we reached was that I couldn’t (and still can’t) let a large, important production job intentionally fail just to provide additional log files. I needed a workaround immediately, and daily reboots were the only thing that seemed to mitigate the problem. That doesn’t mean the issue was resolved, or that R&D should have stopped researching it.
There is a memory-related issue with Connect Server that causes pages not to render (they appear blank in the PDF output) near the end of large jobs. The log files showed “out of memory” errors and “low memory” warnings. Increasing the memory allocation in server.ini helped, but after a large job runs, the Server still “locks up” and doesn’t release its memory, so the next job has a high chance of failing.