Find the last non-blank row in a record

Hi folks,

I’m having a few problems setting up a robust DataMapper for the data we’re running through Planet Press. The short version is, the raw data represents invoice information that we’re collating into nicely formatted statements for our customers. The raw data files themselves are reasonably well structured and consistent apart from one area - the end of each record and the end of the report.

Generally speaking, pagination in the raw document is denoted by the FF character. The last four lines in a record vary however, and I need a way of robustly handling the possible scenarios that could exist. Let’s say we have a record with 90 lines. Counting backwards from the very last line:

  • Line 90 would contain just the FF character.
  • If the customer has requested their statements via email, line 89 will contain the text “#email” to denote they are an email customer (which we then use elsewhere in the DataMapper), OTHERWISE, line 89 contains the totals for their statement.
  • If the customer has requested their statements via email, then line 88 will contain the totals for their statement.

As a final wrinkle to this, if the record is the last one in the document, there will be no final FF character, the document just ends. So, in this instance, the last line would contain either the “#email” text or their totals for the statement.

Sorry that this is a bit of a big ask, but I’m really struggling to get my head around how to build a robust process in DataMapper that can handle any of those scenarios. I did wonder if it might be possible to tell it to look for the last row in a record that contains numeric data, assume that those are the totals and go from there, but even then I’m not sure how best to implement that.

If anyone is able to suggest how I could approach this I’d be most grateful.

All the best,

Rich

Since you will have (or not) a keyword “#email” at the end, then you could use it.

While reading your lines in a loop (a repeat until no more element), you add a condition that looks for the ‘#email’ tag on your line. If it finds it you act a way (data extract or otherwise) if it doesn not, you act another way (again data extract or otherwise).

If you have the ‘#email’ tag, you could put it in to an extraction fields AND extract the total from the previous line into another field (total) but both in the same extraction steps.

If you data is not sensitive, posting it here would help give you proper directions. If it is sensitive, I suggest you open a technical support call at 1-866-348-5863 or through our website.

Hi Hamelj,

Many thanks for your reply, I think that’s given me what I need. I’ve set up a logic split as you describe, and then two different ways of processing the lines depending on whether the ‘#email’ tag is present or not - that appears to do the trick!

All the best,

Rich