Extracting Text Data

Hi,

I am getting the error below after all records have been extracted.

WARN [22 Mar 2016 13:35:57,709][DataMapping thread] com.objectiflune.datamining.text.internal.engine.StepExecuter.runSkip(?:?)[COMPONENT=Data Mapping][SOURCE=Goto123] The step is out of bound (DME000130)

I am using a delimited text file in which every record, including the last one, ends with the form feed delimiter (\f). The file has only 100 records, but the mapper shows 101, and I get the error when printing all records or viewing the last one.

Is there a way I can use a script to stop reading records if there is no more data after the delimiter?

Could you provide us with a small sample data file? Should be fairly easy to fix.

How do I attach a file here? I can’t see any option for that.

The data is below. It contains 2 records; replace [FORM FEED CHARACTER HERE] with \f. The mapper shows 3 records for this data, but it should show only 2.

W001 PmlTest.ptk
W002 FalseA POST
W003 PML Test 00204000.pdf
W004
A001 1149783813
A002 054
A003 Mr A Smith
A004 GPO BOX 1
A005 HOBART TAS 7001
A006
A007
A008
A009
A010
A011 1
B001 00000010000101
C001 1
C002 1
C003 1.1
C004 1
C005 1
C006 1
C007 1
M001 Simplex
M002 Portrait
M003 A4
M004 75
M005 PMLTheoreticalBaseStock
D001 204000
D002 1
D003 23/07/2013
D004 31/08/2013
D005 01/07/2012
D006 19/07/2013
D007 Farmland
D008 100 White Road Whitville TAS 7124
D009 LOT 1 DP12 EP 20000
D010 $100.00
D011 $200.00
D012 31/08/2013
D013 $400.00
D014 30/11/2013
D015 $800.00
D016 28/02/2014
D017 $260.00
D018 31/05/2014
D019 $1,660.00
D020 204000
D021 204000
D022 00000000204000
D023 *244 000 204000 9
D024 01/07/2013 to 30/06/2014
D025 9.00
D026 2013/14
I001
P001
P002
P007
P003
P005 Farmland Base Rate 1 Base Levy $400.00
P004 Farmland 4 0.0025c/$ $22.10
P004 Rural Waste Fee $60 flat rate $60.00
P006
J001 Newsletter
J006 Information
[FORM FEED CHARACTER HERE]W001 PmlTest.ptk
W002 FalseA POST
W003 PML Test 00204100.pdf
W004
A001 1131080674
A002 054
A003 Mr B Braxton
A004 GPO BOX 2
A005 HOBART TAS 7001
A006
A007
A008
A009
A010
A011 2
B001 00000010000201
C001 1
C002 1
C003 1.2
C004 1
C005 1
M001 Simplex
M002 Portrait
M003 A4
M004 75
M005 PMLTheoreticalBaseStock
D001 204100
D002 2
D003 23/07/2013
D004 31/08/2013
D005 01/07/2012
D006 19/07/2013
D007 Residential
D008 101 White Road Whitville TAS 7124
D009 LOT 2 DP12 EP 20100
D010 $200.00
D011 $200.00
D012 31/08/2013
D013 $400.00
D014 30/11/2013
D015 $800.00
D016 28/02/2014
D017 $260.00
D018 31/05/2014
D019 $1,660.00
D020 204100
D021 204100
D022 00000000204100
D023 *244 000 204100 9
D024 01/07/2013 to 30/06/2014
D025 9.00
D026 2013/14
I001
P001
P002
P007
P003
P005 Residential Base Rate 2 Base Levy $400.00
P004 Residential 10 0.0045c/$ $90.00
P004 Rural Waste Fee $60 flat rate $60.00
P006
[FORM FEED CHARACTER HERE]
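
For illustration only, here is a minimal Python sketch of why a file that ends with the delimiter appears to have one record too many. It is not DataMapper code; the shortened sample string and the helper name split_records are assumptions made up for this example.

```python
# Standalone illustration, not DataMapper code.
# 'sample' is a shortened, hypothetical stand-in for the two-record file above.
sample = (
    "W001 PmlTest.ptk\nA003 Mr A Smith\n\f"
    "W001 PmlTest.ptk\nA003 Mr B Braxton\n\f"
)


def split_records(data: str, delimiter: str = "\f") -> list[str]:
    """Split on the delimiter and drop chunks that contain only whitespace."""
    return [chunk for chunk in data.split(delimiter) if chunk.strip()]


# A plain split keeps the empty chunk that follows the final form feed,
# which corresponds to the extra, empty record.
naive = sample.split("\f")
print(len(naive))

# Dropping whitespace-only chunks leaves just the real records.
print(len(split_records(sample)))
```

The plain split counts the empty text after the last form feed as a record of its own, which matches the 101st record described above.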

Click on the chain link in the Editor Toolbar then select the “Send” tab in the dialog: that allows you to upload a file to the server.

As for your data, you should just select the “W001” string at the top of the page, then in the Settings pane change the Page Delimiter Type from On Lines to On Text. A new document boundary will be set each time that string is found in that location, regardless of where the FF characters are.
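
The idea behind the On Text setting can be sketched in a few lines of Python. This is only an assumption-laden illustration of the logic (start a new record whenever a line begins with the marker, wherever the form feeds are), not how Connect implements it; the function name split_on_marker and the shortened sample are hypothetical.

```python
# Standalone illustration of splitting on a text marker instead of on form feeds.
# 'sample' is a shortened, hypothetical stand-in for the data posted above.
sample = (
    "W001 PmlTest.ptk\nA003 Mr A Smith\n\f"
    "W001 PmlTest.ptk\nA003 Mr B Braxton\n\f"
)


def split_on_marker(data: str, marker: str = "W001") -> list[str]:
    """Start a new record at every line that begins with the marker."""
    records: list[list[str]] = []
    # Remove form feeds first so they play no part in finding boundaries.
    for line in data.replace("\f", "").splitlines():
        if line.startswith(marker):
            records.append([line])      # marker found: open a new record
        elif records:
            records[-1].append(line)    # otherwise extend the current record
        # lines before the first marker are ignored
    return ["\n".join(lines) for lines in records]


records = split_on_marker(sample)
print(len(records))
```

Because boundaries depend only on the marker, a trailing form feed at the end of the file no longer creates an empty record.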

Thanks, that worked.

Our older data files did not start with W001 but we have not used those for a few years.

With PlanetPress 7 we use the form feed character as a delimiter, so I just wanted to use it in Connect for consistency.