How to use a script as a boundary trigger

I’m slowly beginning to make some progress through various features in Connect.

I can see how the boundary trigger on text works but I need a little bit more control so I want to use the trigger on script. The document has a control marker “((n,m))” where m is optional. I need a boundary whenever a valid control marker is found. It’s perfectly practical to write a regex to validate the value but I can’t figure out the structure for triggering on a script.

Could someone help me with the basic syntax?

A lot of Connect still doesn’t make sense, and I’m sure some training would help, but as I’m the only person working with Connect it’s just too expensive for us.

Hi DougA,

What type of Data Sample are you using: a PDF, CSV, DB, TXT or XML file?

Can you be more specific about what you are trying to achieve with your boundaries?

Best Regards,

Juan

The source files are PDFs, each containing multiple multi-page documents. I’m trying to split each file into its individual documents. The record boundary indicator is a command string in the format “((n,m))”. I believe this was used by an old folder inserter system, where “n” is the new envelope command and “m” is an optional insert marker.

I don’t need to do any other processing, just identify where the next document starts.

Thanks

Doug

Doug,

For PDFs, your boundaries script should be similar to the following:

// In the line below, adjust the location of the region
// you want to examine. Parameters are (Left, Top, Width, Height)
// and are expressed in mm.
var myRegion = region.createRegion(170,25,210,35);
var regionStrings=boundaries.get(myRegion);
if (regionStrings) {
 for (var i=0;i<regionStrings.length; i++) {
  // Matches ((n)) or ((n,m)); "n" and "m" are matched as literal letters.
  if (regionStrings[i].match(/\(\(n(,m)?\)\)/i)) {
   boundaries.set();
  }
 }
}

The RegExp will attempt to match ((n,m)) or ((n)) against any of the strings in the specified region and if it does, a document boundary is then set.

Note that you should try to specify the smallest possible region in order to keep the algorithm efficient. It would still work on a much larger region, which can help if the marker moves around from page to page.
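If you want to check the matching rule outside the DataMapper first, a plain JavaScript sketch like the one below may help. It assumes the marker uses the literal letters "n" and "m", as written in this thread; if they stand for numbers in your data, you would need something like \d+ instead. The sample strings are made up and only stand in for what boundaries.get() might return.

```javascript
// One way to express "((n)) or ((n,m))" as a pattern.
// Assumption: "n" and "m" are literal letters, not placeholders for digits.
var markerPattern = /\(\(n(,m)?\)\)/i;

// Hypothetical strings standing in for the contents of the region.
var samples = [
  "((n,m))",           // full marker
  "((n))",             // marker without the optional part
  "Page 1 of 3",       // ordinary text
  "((n,))",            // malformed marker
  "text ((N,M)) text"  // marker embedded in other text, upper case
];

// Test each sample and collect true/false per string.
var results = samples.map(function (s) {
  return markerPattern.test(s);
});

console.log(results); // [ true, true, false, false, true ]
```

The pattern has no g flag on purpose: with g, test() keeps state between calls through lastIndex, which can give surprising results when the same pattern is reused in a loop.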

Hi there

I’m new to scripting and I’m trying to set a PDF document boundary using the script below. It looks for either “private and confidential” or “residential security act” in the given area. How can I get it to start a new record on each page that contains one of these strings? Most records in the document are single pages, but some spill onto a second page (this second page, when it exists, doesn’t contain either of the strings above). At the moment the script is just lumping everything together as one record.

var x = 17; // Left position from page origin (mm)
var y = 42; // Top position from page origin (mm)
var w = 71; // Width of the region (mm)
var h = 14; // Height of the region (mm)

var myRegion = region.createRegion(x, y, w, h);
var regionStrings = boundaries.get(myRegion);

if (regionStrings && regionStrings.length) {
	for (var i = 0; i < regionStrings.length; i++) {
		if (/private and confidential|residential security act/i.test(regionStrings[i])) {
			boundaries.set(myRegion);
			break;
		}
	}
}

Hello @marrd, can you make the following adjustments and try again please?

var x = 17;
var y = 42;
var w = 71;
var h = 14;

var myRegion = region.createRegion(x, y, w, h);
var regionStrings = boundaries.get(myRegion);

//if (regionStrings && regionStrings.length) {
if (regionStrings) {
	for (var i = 0; i < regionStrings.length; i++) {
		if (/private and confidential|residential security act/i.test(regionStrings[i])) {
			//boundaries.set(myRegion);
			boundaries.set();
			//break;
		}
	}
}
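To see what grouping this should produce, here is a small stand-alone sketch in plain JavaScript. It does not use the Connect API: groupPages is a made-up helper, the page texts are invented, and it assumes the boundary script is evaluated once per page, with a new record starting on every page where boundaries.set() is called.

```javascript
// Hypothetical region text for each page of a sample PDF.
var pages = [
  "Private and Confidential",  // page 1: starts a record
  "",                          // page 2: overflow page, no marker text
  "Residential Security Act",  // page 3: starts a record
  "PRIVATE AND CONFIDENTIAL"   // page 4: starts a record
];

var marker = /private and confidential|residential security act/i;

// Made-up helper that mimics the expected grouping; not part of Connect.
function groupPages(pageTexts, pattern) {
  var records = [];
  for (var p = 0; p < pageTexts.length; p++) {
    // A matching page, or the very first page, opens a new record.
    if (pattern.test(pageTexts[p]) || records.length === 0) {
      records.push([]);
    }
    // Store 1-based page numbers in the current record.
    records[records.length - 1].push(p + 1);
  }
  return records;
}

var records = groupPages(pages, marker);
console.log(JSON.stringify(records)); // [[1,2],[3],[4]]
```

Pages without either string simply stay attached to the record that was opened before them, which is the behaviour wanted for the overflow pages.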

Yes, that now works correctly. Thanks, Marten.
