In the DataMapper I am trying to create a regex that splits address blocks (Mid-European addresses only) into
street
(country code) ZIP City
country
but so far without success.
Examples of the possible formats (captured lines are already separated by |):
Sir|Alfred Testman |c/o Watermelon 2245 Inc|Brownstreet 13|6280 Hochdorf|SCHWEIZ
Alfred Testman |Testman 2245|Yellowlane 13|CH-6280 Hochdorf|SCHWEIZ
Boris Checkman |c/o Bavarian 5555|Morninglane 13|CH-6280 Hochdorf
Peter Pan|Oststrasse 13|99999 Simcity
The goal is to identify the first 4-5 digit ZIP code (with the appropriate country code).
record.fields.AdrBlock.match(/(\d{4,5}\s)(.[^|]+)/gi).slice(-1)[0].trim(); //works but is stripping the country code.
Prerequisites: the country may be empty, the ZIP may be preceded by a country code, the ZIP is 4-5 digits, and the street is always the line before the ZIP/city.
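To reproduce the symptom outside the DataMapper, here is a minimal sketch that runs the statement above against two of the sample lines. The names samples and lastZipCity are hypothetical, and plain strings stand in for record.fields.AdrBlock.

```javascript
// Stand-ins for record.fields.AdrBlock, copied from the sample lines above.
var samples = [
  "Sir|Alfred Testman |c/o Watermelon 2245 Inc|Brownstreet 13|6280 Hochdorf|SCHWEIZ",
  "Alfred Testman |Testman 2245|Yellowlane 13|CH-6280 Hochdorf|SCHWEIZ"
];

// The statement from the question, wrapped in a (hypothetical) function.
function lastZipCity(adrBlock) {
  var matches = adrBlock.match(/(\d{4,5}\s)(.[^|]+)/gi);
  // match() returns null when nothing matches, so guard before slicing.
  return matches ? matches.slice(-1)[0].trim() : null;
}

// With the g flag, match() returns every full match, not the capturing groups.
// On line 1 the company name "2245 Inc" matches too; slice(-1) keeps the last hit.
var allMatchesLine1 = samples[0].match(/(\d{4,5}\s)(.[^|]+)/gi);

// On line 2 the pattern starts at the digits, so the leading "CH-" is dropped.
var zipCityLine2 = lastZipCity(samples[1]);
```

The pattern has no place for the country code and no anchor to the start of a segment, which is why "CH-" is lost and "2245 Inc" is picked up as a candidate.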
So the result should look like this:
Line 1:
Street: Brownstreet 13
ZipCity: 6280 Hochdorf
Country: SCHWEIZ
Line 2:
Street: Yellowlane 13
ZipCity: CH-6280 Hochdorf
Country: SCHWEIZ
Line 3:
Street: Morninglane 13
ZipCity: CH-6280 Hochdorf
Country: null
Phil, thx!
Changed it to /(?<=\|)(([A-Za-z]{2}[- ])?\d{4,5}\s)(.[^|]+)/gi
to eliminate the wrong match (Line 1), allowing a space between country code and ZIP and checking for a leading "|". ZIP+City now work (sometimes I can't see the wood for the trees).
And: any idea how to capture the line before and after that line (i.e. the previous and following text surrounded by "|")?
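One possible way to get the neighbouring segments is to capture them in the same expression. This is only a sketch: splitAddress is a hypothetical helper, the input is a plain string instead of record.fields.AdrBlock, and it avoids lookbehind by using a non-capturing group for the leading pipe.

```javascript
// Hypothetical helper; in the DataMapper the input would be record.fields.AdrBlock.
function splitAddress(adrBlock) {
  // (?:^|\|)  start of the string or a pipe, not captured
  // group 1   the segment before ZIP/City (the street)
  // group 2   optional country code, 4-5 digit ZIP, city
  // group 3   the segment after ZIP/City (the country), if there is one
  var re = /(?:^|\|)([^|]*)\|((?:[A-Za-z]{2}[- ])?\d{4,5}\s[^|]+)(?:\|([^|]*))?/i;
  var m = adrBlock.match(re);
  if (!m) { return null; }
  return {
    street: m[1].trim(),
    zipCity: m[2].trim(),
    // An absent or empty trailing segment is reported as null.
    country: m[3] ? m[3].trim() : null
  };
}

var line1 = splitAddress("Sir|Alfred Testman |c/o Watermelon 2245 Inc|Brownstreet 13|6280 Hochdorf|SCHWEIZ");
var line3 = splitAddress("Boris Checkman |c/o Bavarian 5555|Morninglane 13|CH-6280 Hochdorf");
```

Because the ZIP/City segment must start directly after a pipe, "2245 Inc" in the middle of the company name is not picked up.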
Hi @RalfG, I assume that the DataMapper cannot handle the following part of your Regular Expression: "(?<=\|)", because without it the Regular Expression seems to work fine.
The expression (?<=\|) triggers the RegEx engine's lookbehind functionality (i.e. the rest of the regular expression is a match if, and only if, the preceding character is a |). That functionality was added in the ECMAScript 2018 specification, but the DataMapper's JavaScript engine implements the ECMAScript 2016 spec, so lookbehind is not supported.
But in your case, you don't have to use lookbehind. You can adjust your RegEx to look for the | character without capturing it:
Notice the /i)[1] options and index at the end of the statement, which instruct the JS engine to retrieve the content of the first capturing group instead of the fully matched expression.
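A minimal sketch of that idea (not necessarily the exact statement posted): the pipe sits outside the capturing group, there is no g flag, and index [1] picks the group. The function name zipCity is hypothetical, and a plain string stands in for record.fields.AdrBlock.

```javascript
// Hypothetical wrapper; in the DataMapper the input would be record.fields.AdrBlock.
function zipCity(adrBlock) {
  // \|          the leading pipe is matched but sits outside the capturing group
  // ( ... )     group 1: optional country code, 4-5 digit ZIP, city
  // no g flag   match() then returns [fullMatch, group1, ...] instead of all matches
  var m = adrBlock.match(/\|((?:[A-Za-z]{2}[- ])?\d{4,5}\s[^|]+)/i);
  // m[0] would still contain the pipe; m[1] is the ZIP/City text only.
  return m ? m[1].trim() : null;
}

var withCountryCode = zipCity("Alfred Testman |Testman 2245|Yellowlane 13|CH-6280 Hochdorf|SCHWEIZ");
var withoutCountryCode = zipCity("Sir|Alfred Testman |c/o Watermelon 2245 Inc|Brownstreet 13|6280 Hochdorf|SCHWEIZ");
```

The null check matters because match() returns null when nothing matches, and indexing null would throw.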
Then, in the extraction step, I just pulled the array elements, adding the international value to the array index:
e.g. Street:
sourceRecord.properties.Adressblock[1+sourceRecord.properties.international];
e.g. City:
sourceRecord.properties.Adressblock[(0+sourceRecord.properties.international)];
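How Adressblock and international are populated is not shown above. The following sketch rests on an assumption: the block is split on "|" and reversed, and international is 1 when a country line follows the ZIP/City line and 0 otherwise. buildProperties is a hypothetical helper and a plain object stands in for sourceRecord.properties.

```javascript
// Assumption: Adressblock holds the segments in reverse order (last line first),
// and international is 1 if the last line is a country, else 0.
function buildProperties(adrBlock) {
  var zipLine = /^(?:[A-Za-z]{2}[- ])?\d{4,5}\s/;
  var parts = adrBlock.split("|").map(function (s) { return s.trim(); }).reverse();
  // If the last line is not the ZIP/City line, treat it as the country line.
  var international = zipLine.test(parts[0]) ? 0 : 1;
  return { Adressblock: parts, international: international };
}

var props = buildProperties("Alfred Testman |Testman 2245|Yellowlane 13|CH-6280 Hochdorf|SCHWEIZ");

// Same index arithmetic as in the two statements above.
var street = props.Adressblock[1 + props.international];
var city = props.Adressblock[0 + props.international];
var country = props.international ? props.Adressblock[0] : null;
```

With this layout the offset shifts both indexes by one whenever a country line is present, so the same two statements work for domestic and international addresses.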