Extract detail rows for orders where they are not in sequence

Hi

I have a csv data file of orders, each row of data contains the items for all the orders, however it seems that the items can get split and the items correlating to an order are not concurrent in the file. Below is an example:

Order NameProduct NumberProduct NameQuantity
__ORD0000049192____PRD00000616____Book____1__
ORD0000049195PRD00000615Leaflet3
ORD0000049196PRD00000501Bookmark5
__ORD0000049192____PRD00000420____Pen____10__
ORD0000049193PRD00000202Calculator2
How can I ensure I group all items correlating to that order into a detail table, I see the [linear blog post](https://learn.objectiflune.com/howto/create-detail-table-linear-data-structure/) thrown around a lot but this is as the name sugests a linear data file where the order items are all on the same data row and not what I am after, clearly.

Any help would be appriciated, Ideally I want to sort the data first in datamapper but this is not a feature of the software for some reason, which I find odd as Im sure I am not the only use case for this!

Thanks
Liam

1 Like

You could re-arrage you data using Workflow’s metadata prior to your Datamapper.

In Workflow:

  1. Use Change Emulation plugin to read your CSV file as line printer. 1 line per page.
  2. Use Create Metadata plugin with the document set to none. That will generate a metadata file of 1 document and as many datapages as their is line in your CSV.
  3. Use Metadata Fields Management plugin to add 1 field as so:
    1. Level = Datapage
    2. Field name = Line
    3. Field value = The first line of your CSV file
  4. Use Metadata Sorter plugin to sort on the Line field
  5. Use Metadata Sequencer plugin as so:
    1. Metadata Level = Data page
    2. Sequencing is based on… -> The following number of occurences of the level = 1
  6. Use Set Jobs Infos and Variables plugin to set the jobinfos %1 to the content of the metadata field Line
  7. Use Create File plugin and add %1 in it
  8. Use Send To Folder plugin to output to a folder with a unique filename. Make sure you check the option Concatenate files, leaving the Separator string field empty

Hope that helps.

If you’re into scripting, you could accomplish this fairly simply in the pre-processor, I’m sure.

However, if you prefer not to script, I can think of another way to accomplish this. Open it as a database.

Now, this could be done in the datamapper itself if you wish, just be sure to install the 64bit Access Database Engine redistributable from Microsoft. Unfortunately, in it’s current iteration, your options for dynamic ODBC connections are limited in the datamapper, though I hear good things are planned to spruce this up in the next version. Still, a simple static query could do the job here:

SELECT * FROM ‘unsorted data.csv’ ORDER BY OrderName

When doing it this way, you’d just need to be certain that your data file is always named the same way and stored in the same location. Easily done with the workflow.

Alternatively, the workflow itself has a Database Query plugin which would allow you to read the CSV, sort it, and output a new sorted CSV file. This amounts to the same thing, but with the added benefit of being able to make the query dynamic (ie, point to a different file name).

Now, I should say, this will work on a small scale, but I have my doubts about how well it would scale up. Ultimately, I believe you’d want to resort the data via the preprocessor itself. I imagine the easiest way to go about it would be to convert the CSV file into an array, sort it with array sorting methods (JavaScript Array Sort), and then write it back to CSV.

Hi Hamelj

Thanks for this, however this sorts the entire row of data and doesnt sort by a single field, i.e. Order Name, unless I am missing something?

Thanks

Liam

I can script it but was trying to avoid it if I could and hoping there would be a more “off the shelf” solution I was missing!

Based on the data example you provided, yes it will. But that should be what you want as a result since the field you are looking for, is the first one. Since the ordering is alphabetical, you will have the proper result.

It will order each line while looking at it as a long string. So as long as your order name is the same, they will be concurrent.

If your data is somewhat different, then the logic might need to be revisited…