Make a data extraction to a txt file that depends on channels

aguzman · August 9, 2016, 5:16pm

How do I extract data based on channels? For example, in the data below the first character identifies the line (or lines) at that level, and it can be repeated throughout the file:

Yxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxx

Zxxxxxxxxxxxxxxxxxxxxxxxxxx

Txxxxxxxx xxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx

xxxxxxxx xxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxx

xxxxxxxx xxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx

Phil · August 10, 2016, 12:08pm

Hi,

Your question is a little difficult to answer without seeing the actual data, but I can try giving you a few pointers.

Usually, these types of data files include a channel that delimits where each record begins. You should use that marker to define boundaries in your Data Mapping config. Let’s say your data is similar to the one below:

1NewRecord
2John Doe
2111, Main Street
2SomeCity, SomeState
2SomeCountry
2SomeZip
3123456 First product            10$     4        40$
3987651 Second product           22$     3        66$
4Subtotal: 106.00$
4Taxes: 15.90$
4Total: 121.90$
1OtherRecord
2Jane Smith
2333, 1st Avenue
2Suite #236
2SmallTown, SomeProvince
2SomeCountry
2SomeZip
3123456 First product            10$     4        40$
3654328 Third product             6$     1         6$
3837463 Fourth product           12$     2        24$
4Subtotal: 70.00$
4Taxes: 10.50$
4Total: 80.50$

In this example, the digit “1” in the first column indicates a new record. So you would first highlight that “1” in the Text Viewer, then in the Settings panel you would set the Page delimiter type option to On text. The DataMapper will automatically fill in the other properties according to your selection in the data. With the above sample data, you would now have 2 records.
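To illustrate the boundary logic outside of the DataMapper, here is a minimal sketch in plain JavaScript. It is not the DataMapper API; the function name `splitRecords` and the shortened sample lines are assumptions made for the example. It starts a new record each time the first character of a line is “1”.

```javascript
// Plain JavaScript illustration only, not DataMapper code.
// splitRecords is a hypothetical helper: it starts a new record
// whenever the channel (first character) of a line is "1".
function splitRecords(lines) {
    var records = [];
    var current = null;
    lines.forEach(function (line) {
        if (line.charAt(0) === "1") {
            current = [];
            records.push(current);
        }
        // Lines that appear before the first "1" are ignored.
        if (current !== null) {
            current.push(line);
        }
    });
    return records;
}

// Shortened version of the sample data above.
var sampleLines = [
    "1NewRecord",
    "2John Doe",
    "3123456 First product            10$     4        40$",
    "4Total: 121.90$",
    "1OtherRecord",
    "2Jane Smith",
    "4Total: 80.50$"
];
var sampleRecords = splitRecords(sampleLines);
console.log(sampleRecords.length); // 2
```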

Let’s say you then want to extract the detail lines, which are identified in the sample above with channel “3”. You would highlight the first “3” in the record and click on the Add Goto step button. Set its properties to “next occurrence of” and the rest of the properties will automatically be filled according to your selection in the Text Viewer. Make sure you untick the “inspect entire page width” option so that the Goto Step only inspects the first column. Then you can add a standard loop that repeats while that first column is a 3.
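The Goto-then-loop idea can be sketched in plain JavaScript as follows. Again, this only mimics the logic and is not DataMapper code; `extractDetailLines` is a hypothetical name. It first skips ahead to the next line whose first column is “3”, then keeps reading lines while the first column is still “3”.

```javascript
// Plain JavaScript illustration only, not DataMapper code.
// extractDetailLines is a hypothetical helper working on the lines of ONE record.
function extractDetailLines(recordLines) {
    var i = 0;
    // Equivalent of "Goto next occurrence of 3", looking at the first column only.
    while (i < recordLines.length && recordLines[i].charAt(0) !== "3") {
        i++;
    }
    var details = [];
    // Equivalent of the loop that repeats while the first column is a 3.
    while (i < recordLines.length && recordLines[i].charAt(0) === "3") {
        details.push(recordLines[i].substring(1)); // drop the channel character
        i++;
    }
    return details;
}

var johnDoe = [
    "1NewRecord",
    "2John Doe",
    "3123456 First product            10$     4        40$",
    "3987651 Second product           22$     3        66$",
    "4Subtotal: 106.00$"
];
var johnDetails = extractDetailLines(johnDoe);
console.log(johnDetails.length); // 2
```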

Hope this is helpful.

aguzman · August 10, 2016, 12:23pm

Thanks for the help,

Just one thing: how should I proceed if the first character (channel) marks the beginning of a paragraph with transaction details?

for example:

1NewRecord
2John Doe
 111, Main Street
 SomeCity, SomeState
 SomeCountry
 SomeZip
3123456 First product            10$     4        40$
 987651 Second product           22$     3        66$
4Subtotal: 106.00$
 Taxes: 15.90$
 Total: 121.90$
1OtherRecord
2Jane Smith
 333, 1st Avenue
 Suite #236
 SmallTown, SomeProvince
 SomeCountry
 SomeZip
3123456 First product            10$     4        40$
 654328 Third product             6$     1         6$
 837463 Fourth product           12$     2        24$

5 card No. XXXXXX-XXXX-XXXXX-XXX jhon doe

3123456 First product            10$     4        40$
 654328 Third product             6$     1         6$
 837463 Fourth product           12$     2        24$


4Subtotal: 70.00$
 Taxes: 10.50$
 Total: 80.50$

As you can see, a channel can occur more than once within a page: another channel can appear in between, and the detail channel then continues after it.

Sorry for the long post, but this project is very complex to understand with the Connect logic. It is a design in PP V7 that I need to migrate to Connect.

in PP V7 I do this with Presstalk and its very easy, but here I can't find a way to do it

Thanks again for the help.

Phil · August 10, 2016, 1:09pm

Your loop that processes the detail lines should simply step through each line “until the first character is not empty”. So if it hits a new channel character (e.g. 4), the loop will stop.

You will probably need to implement 2 nested loops: one that goes through every single line and that uses conditions to check what the first character on that line is, and then whenever you find a 3, you implement a nested loop as described above.
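As a rough plain JavaScript sketch of the two nested loops (not DataMapper code; `collectDetailGroups` is a hypothetical name), the outer loop checks the first character of every line, and whenever it finds a “3” the inner loop consumes the following lines for as long as their first character is blank. The sketch assumes continuation lines start with a space; truly empty lines would need extra handling.

```javascript
// Plain JavaScript illustration only, not DataMapper code.
// collectDetailGroups is a hypothetical helper: each "3" line opens a group,
// and the lines that follow with a blank first column belong to that group.
function collectDetailGroups(lines) {
    var groups = [];
    var i = 0;
    // Outer loop: go through every line and test its channel.
    while (i < lines.length) {
        if (lines[i].charAt(0) === "3") {
            var group = [lines[i].substring(1)];
            i++;
            // Inner loop: stops as soon as the first character is not blank.
            while (i < lines.length && lines[i].charAt(0) === " ") {
                group.push(lines[i].substring(1));
                i++;
            }
            groups.push(group);
        } else {
            i++;
        }
    }
    return groups;
}

var janeSmith = [
    "1OtherRecord",
    "2Jane Smith",
    " 333, 1st Avenue",
    "3123456 First product",
    " 654328 Third product",
    " 837463 Fourth product",
    "5 card No.",
    "3123456 First product",
    " 654328 Third product",
    "4Subtotal: 70.00$",
    " Taxes: 10.50$"
];
var janeGroups = collectDetailGroups(janeSmith);
console.log(janeGroups.length); // 2
```

The detail channel appearing twice in the same record simply produces two groups, which matches the case where another channel (here “5”) interrupts the details.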

Also, your comment “in PP V7 I do this with Presstalk and its very easy” is a bit funny: PressTalk is anything but easy, it just happens to be a language you have become comfortable with over the years because you had no other choice in PlanetPress. I should know, I designed and wrote part of it…

Once you get more familiar with Connect, I think you’ll understand what I mean.

aguzman · August 12, 2016, 11:47am

Thanks a lot, that works. I’ll give Connect a try.

farid.fakhri · December 18, 2024, 8:33am

Hi Phil,
This was very useful and worked very well. I have a question about how to get the fields inside the line. I tried adding an Action step after the Extraction step, inside the Repeat loop, but it didn’t work, or my script was not right. Do you have any idea?
Here is the script I used:

// Access Field1 from the first detail record
var field = record.detail.Field1;

if (field && field !== "") {
    var delimiter = "|";
    var parts = field.split(delimiter);

    if (parts.length > 0) {
        for (var i = 0; i < parts.length; i++) {
            var fieldName = "Part" + (i + 1);
            record.fields[fieldName] = parts[i] || "";
        }
    } else {
        record.fields["Part1"] = "Error: Field1 could not be split.";
    }
} else {
    record.fields["Part1"] = "Error: Field1 is empty or undefined";
}

Thanks,

Phil · December 18, 2024, 12:10pm

There are a few issues with your script, but before attempting to fix them, could you explain what you’re trying to achieve?

From what I can tell, your data mapping process has already extracted a piece of data and stored it in record.detail.Field1. Then, your script reads back that value, which may contain multiple parts separated by a pipe character (|), and you want to store each part in its own detail field (record.detail.Part1, record.detail.Part2, etc.).

Do the fields Part1, Part2, Part3, etc. already exist? Because if they don’t, then your code cannot work. But even then, it’s probably not the right approach either. I would personally use a single JSON field (which did not exist yet when I wrote my previous post in this topic).

In the attached example, I am extracting each detail line to a JSON object (an array, in that instance). This requires at least version 2022.1.

cs.OL-datamapper (4.8 KB)
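For readers who cannot open the attachment, the general idea of keeping all parts in a single JSON value can be sketched in plain JavaScript. This is not taken from the attached data mapping config and does not use the DataMapper API; `partsToJson` and the sample value are assumptions made for the example.

```javascript
// Plain JavaScript illustration only; it does not reproduce the attached config.
// partsToJson is a hypothetical helper: it splits a delimited value and
// returns the parts as a JSON array string, so no Part1, Part2, ... fields
// need to exist in advance.
function partsToJson(fieldValue, delimiter) {
    if (typeof fieldValue !== "string" || fieldValue === "") {
        return "[]"; // nothing to split: return an empty JSON array
    }
    return JSON.stringify(fieldValue.split(delimiter));
}

// Hypothetical pipe-delimited detail value.
var jsonParts = partsToJson("123456|First product|10$|4|40$", "|");
console.log(jsonParts); // ["123456","First product","10$","4","40$"]
```

A single array value avoids having to know the number of parts ahead of time, which is the limitation of writing to separately named fields.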