Capturing detail records with a variable number of lines from a PDF

rene · January 6, 2016, 12:01pm

Hi, I am trying to capture a detail table, but some records have 2 lines and others have 5, so I can't create a correct detail record. For example:

https://learn.objectiflune.com/qa-blobs/12181656585927304217.pdf

Phil · January 7, 2016, 3:12pm

Hi Rene,

There are a number of ways of achieving this; we’ve done it several times with Connect. It would be easier to demonstrate if you could provide a sample PDF file that I could use to build a basic data mapping configuration, which I could then post in response.

rene · January 7, 2016, 11:33pm

How can I send you a sample PDF? This platform doesn't allow me to attach a PDF.

Phil · January 8, 2016, 10:18am

Yes, you can. Just click on the link icon in the editor toolbar and you will have the option to upload any file, as you can see from this sample: https://learn.objectiflune.com/qa-blobs/17989878714904170230.pdf

rene · January 11, 2016, 10:41am

Here is the sample PDF:

https://learn.objectiflune.com/qa-blobs/10594745486571234729.pdf

Thanks

Phil · January 11, 2016, 12:32pm

Attached (https://learn.objectiflune.com/qa-blobs/11109223797432418483.ol-datamapper) is a sample data mapping configuration that extracts multiple lines for each description. The loop simply moves from item to item, “remembering” its last position by using a sourceRecord property. Then, the description is extracted by specifying an offset corresponding to the difference between the current position and the last position.

To move from item to item, I used a regular expression that matches the Line Total format. This technique saves the DataMapper from having to inspect each individual line and use conditions to determine whether a new line item was found.
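For readers who want to see the bookkeeping outside the DataMapper, here is a minimal Python sketch of the same idea. It is not the code inside the attached configuration: the sample lines, the `split_items` helper, and the `LINE_TOTAL` pattern (a dollar amount such as $1,234.56 at the end of a line) are all assumptions made for illustration. The sketch remembers the position of the last item and uses the difference between the current and last positions to decide how many lines belong to each item.

```python
import re

# Hypothetical "Line Total" format: a dollar amount such as $1,234.56
# at the end of a line. Adjust to the real layout of your PDF.
LINE_TOTAL = re.compile(r"\$\s?\d{1,3}(?:,\d{3})*\.\d{2}\s*$")


def split_items(lines):
    """Group lines into items; an item starts on a line ending with a line total."""
    items = []
    last_pos = None  # index of the line where the previous item started
    for pos, line in enumerate(lines):
        if not LINE_TOTAL.search(line):
            continue  # not the start of an item, keep moving
        if last_pos is not None:
            # The previous item spans (pos - last_pos) lines.
            items.append(lines[last_pos:pos])
        last_pos = pos  # remember where this item starts
    if last_pos is not None:
        items.append(lines[last_pos:])  # the last item runs to the end
    return items


# Made-up sample data: one 2-line item and one 3-line item.
sample = [
    "1  Widget, blue            $10.00",
    "   extra long description",
    "2  Gadget               $1,250.50",
    "   line two",
    "   line three",
]
items = split_items(sample)
```

Unlike the DataMapper loop, this Python version scans every line explicitly; it is only meant to show how the remembered position yields the number of description lines per item.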

Note that because of the wide gap between each line in the description, the DataMapper adds extra blank lines in between each actual line; I remove those extra lines by specifying a replace() post function in the extraction code, making sure that only one <br /> tag appears after each line.
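The clean-up can be pictured with the following Python sketch. It is an illustration of the idea only, not the replace() call used in the configuration; the `collapse_breaks` helper and the sample string are assumptions. Any run of two or more `<br />` tags is reduced to a single tag.

```python
import re


def collapse_breaks(text):
    """Replace each run of two or more <br /> tags with a single <br /> tag."""
    # Tags in a run may be separated by whitespace; <br>, <br/> and <br /> all match.
    return re.sub(r"(?:<br\s*/?>\s*){2,}", "<br />", text)


# Made-up description with extra blank lines between the real lines.
raw = "Widget, blue<br /><br />extra long description<br /><br /><br />last line"
clean = collapse_breaks(raw)
```

Single `<br />` tags are left untouched, so genuine line breaks in the description survive.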

Let me know if that helps.

rene · January 14, 2016, 11:44am

Thanks, it works great.

I have an additional question: with the PDF below, the last line is not picked up because there is no $ symbol in the subtotal.

https://learn.objectiflune.com/qa-blobs/15236512191779725993.pdf

Phil · January 14, 2016, 12:54pm

Here, I changed the loop condition slightly and added an extract step at the bottom that extracts the total/tax/subtotal values, regardless of whether there’s a currency sign in front of or after the text in the data file. You can switch from one data file to the other and see that both work as expected.

https://learn.objectiflune.com/qa-blobs/6076315609460852683.ol-datamapper
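As a rough illustration of matching an amount whether or not a currency sign is present, here is a Python sketch. It does not reproduce the extract step in the configuration above; the labels, the `AMOUNT` pattern, the `parse_totals` helper, and the sample lines are assumptions. The `$` is optional both before and after the value.

```python
import re
from decimal import Decimal

# Hypothetical layout: a label, then an amount with an optional "$"
# either in front of or after the number.
AMOUNT = re.compile(
    r"^(?P<label>Subtotal|Tax|Total)\s*:?\s*"
    r"\$?\s*(?P<value>\d{1,3}(?:,\d{3})*\.\d{2})\s*\$?\s*$",
    re.IGNORECASE,
)


def parse_totals(lines):
    """Return {label: Decimal amount} for every line that matches AMOUNT."""
    totals = {}
    for line in lines:
        match = AMOUNT.match(line.strip())
        if match:
            value = match.group("value").replace(",", "")  # drop thousands separators
            totals[match.group("label").lower()] = Decimal(value)
    return totals


# Made-up footer lines: sign in front, no sign, sign after.
footer = ["Subtotal  $100.00", "Tax 15.00", "Total 115.00 $"]
totals = parse_totals(footer)
```

Making the sign optional, rather than required, is what lets the same pattern handle both data files.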