That’s an excellent question. I was actually planning on writing a blog article on that topic at some point, but I wasn’t sure it would stir much interest in the community…
So the idea is to loop until the progress reaches 100 (or “done”) without flooding the server with requests for progress updates. To do that, you introduce a delay between requests; the trick is to adjust that delay automatically so that you minimize the total number of calls without delaying the operation needlessly.
Here’s a (relatively) simple way of doing this:
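function monitorDMProgress(operationID) {
    var progressNow = "0";
    var progressBefore = "0";
    var sleepValue = 300; // delay between calls, in milliseconds
    var totalCalls = 0;
    while (true) {
        // responseText is a string, e.g. "42" or "done"
        progressNow = httpGetRespText("/rest/serverengine/workflow/datamining/getProgress/" + operationID);
        totalCalls++;
        // Compare numerically: the string "100" is never strictly equal to the number 100
        if (progressNow === "done" || Number(progressNow) === 100) break;
        // No progress since the last call: wait 50% longer next time
        if (progressNow === progressBefore) sleepValue *= 1.5;
        // Update some log or status file here
        logger.warn("Progress : " + progressNow + " / Calls : " + totalCalls + " / Waiting : " + sleepValue);
        progressBefore = progressNow;
        Watch.Sleep(sleepValue);
    }
}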
This function receives an operationID and monitors the progress of that operation. It initializes a loop and calls the REST endpoint. In the example above, the httpGetRespText() method is just a wrapper around XMLHTTPRequest.open("GET",...) that returns the responseText value received from the server.
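In case it helps, here is a minimal sketch of what such a wrapper could look like. This is not my actual helper, just an illustration: it assumes an XMLHttpRequest-style object is available in your scripting host, and the optional createRequest parameter is a hypothetical addition that lets you supply a different request object (in a Windows JScript host you might need a COM object such as MSXML2.ServerXMLHTTP instead).

```javascript
// Hypothetical sketch of the httpGetRespText() helper.
// createRequest is an assumed, optional factory returning an
// XMLHttpRequest-compatible object (useful for other hosts or for testing).
function httpGetRespText(url, createRequest) {
    var factory = createRequest || function () { return new XMLHttpRequest(); };
    var xhr = factory();
    // Third argument false = synchronous: the call blocks until the reply arrives
    xhr.open("GET", url, false);
    xhr.send(null);
    if (xhr.status !== 200) {
        throw new Error("GET " + url + " failed with HTTP status " + xhr.status);
    }
    // Always a string, e.g. "42" or "done"
    return xhr.responseText;
}
```

The call is synchronous on purpose, since the monitoring loop has nothing else to do until it gets an answer.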
If the result from the endpoint is 100 or “done”, then the script exits the loop. If not, it checks whether the progress value has changed since the last call. If it hasn’t, it increases the delay before the next call by 50%. So from a starting value of 300 ms, the delay grows to 450, 675, 1012.5, and so on.
Important: the value is only increased when the progress does not change between two calls. Otherwise it remains the same.
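To make that rule concrete, here is a small standalone sketch that applies it to a made-up series of progress readings. The nextDelay() helper and the readings are purely illustrative and are not part of the monitoring function itself.

```javascript
// Returns the delay to use before the next call: 50% longer when the
// progress has not moved since the previous call, unchanged otherwise.
function nextDelay(currentDelay, progressNow, progressBefore) {
    return progressNow === progressBefore ? currentDelay * 1.5 : currentDelay;
}

// Made-up progress readings, as strings like the endpoint's response text
var readings = ["10", "10", "10", "25", "25"];
var delay = 300;
var previous = "0";
var delays = [];
for (var i = 0; i < readings.length; i++) {
    delay = nextDelay(delay, readings[i], previous);
    delays.push(delay);
    previous = readings[i];
}
// delays is now [300, 450, 675, 675, 1012.5]
```

Note that the delay never shrinks again once it has grown, which is fine for a single monitoring run.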
Now this technique is imperfect: it takes for granted that the data mapping operation is linear and that all records take about the same time to process. You could implement a more robust method that stores the successive progress values along with the total time spent so far, and uses them to work out the best value for the next delay. That would obviously require a much more complex algorithm.
Those more complex algorithms are exactly what the progress bars in Windows use when you install new software… and you know how imprecise those can be! … So I think that my method, while imperfect, gives you a pretty good overview without being too involved, and without flooding the server with requests.
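For the curious, the more robust approach mentioned above could start from something like the rough sketch below. I have not used this in production; it is just one possible way to turn stored progress values and elapsed time into a delay. The estimateNextDelay() name, the sample format ({time, progress}) and the default options are all assumptions of mine. It averages the time spent per percentage point so far and waits roughly long enough for the progress to advance by another few points.

```javascript
// samples: array of { time: <ms since start>, progress: <0..100> }, oldest first.
// options (all hypothetical): minDelay and maxDelay in ms, step in percentage points.
function estimateNextDelay(samples, options) {
    var opts = options || {};
    var minDelay = opts.minDelay || 300;
    var maxDelay = opts.maxDelay || 10000;
    var step = opts.step || 5;

    // Not enough history yet: poll at the fastest allowed rate
    if (samples.length < 2) return minDelay;

    var first = samples[0];
    var last = samples[samples.length - 1];
    var elapsed = last.time - first.time;
    var gained = last.progress - first.progress;

    // No measurable progress so far: back off as far as allowed
    if (elapsed <= 0 || gained <= 0) return maxDelay;

    // Average time spent per percentage point so far
    var msPerPercent = elapsed / gained;
    // Never aim past 100%
    var target = Math.min(step, 100 - last.progress);
    var delay = msPerPercent * target;
    return Math.max(minDelay, Math.min(maxDelay, delay));
}

// 10 points in 2 seconds is 200 ms per point, so about 1000 ms for the next 5 points
var suggested = estimateNextDelay([{ time: 0, progress: 0 }, { time: 2000, progress: 10 }]);
```

It still assumes the work is roughly linear over the sampled window, so it shares the basic weakness of the simple version.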