Multiple Data Samples

I have the following situation:
I have text files whose first line contains a type identifier such as AEH036 or AEH046. Using Multiple Conditions, I can extract the data for each type within the same DataMapper, and in Design I would create conditions for each type to call a page with a specific layout. However, I can also receive files of other types (AEH026, AEH055, …), where each type has a different data structure, so a separate extraction is needed for each type of form.

Within this same DataMapper, can I place all these possible types of files as a Data Sample and perform customized extractions through Conditions?

There will be only one Design file, with several pages containing the layouts for each file type, called according to the type that comes on the first line of each record.

Regards,

When you say “different data structure”, do you mean that each field is located at a different position in the text file, or are these different types of data file, such as CSV, line printer, and XML?

Different positions.
The file used Channel Skip emulation on the old PP7; we are now trying to migrate.

Yes you can, but that doesn’t mean you should.

In PlanetPress, it made sense to try and address all data streams at once because the data extraction and document composition all happened in the same template.

In OL Connect, these steps are separate: you can have several data mapping configs that generate the same data model, which can then be fed into a single template (or multiple ones).

It might be easier to maintain several simple DM configs instead of a big one that contains all possible conditions. Not to mention that each individual DM Config is likely to be more efficient because it will deal with a single type of data.

Yeah, but every text file I receive will contain several types.
E.g.:
JOB1234
contains AEH036 and AEH046 data.
I would make a DM for each AEHxxx separately, but using the same sample data file, since I cannot separate the data from this JOB1234, correct?
And in my Design file, would I have to create a file for each template and use that specific DM, or can I use all the DMs and create several print sections with conditions based on them?

If a single file may contain several data types, then your approach of using conditional statements is correct, in order to handle the various cases in a single DM config.

You will probably want to use the Multiple Conditions step instead of using nested condition steps.

Tks Phil,
But my big problem is mapping the JOB1234 file, which contains AEH036 and AEH046, while at the same time I can receive JOB5678, which can contain AEH026 and AEH028 data. What would be the best way to map this?
Within the same DataMapper, using multiple conditions for each JOBxxxx file, or by creating a DM for each AEH type that may appear in the data files? And how would I bring several DMs into my Designer so that multiple design files are not created, but only one with multiple sections?

You cannot use several data models with a single template unless those data models are identical (or one is an exact subset of the other).

It’s difficult to advise you on how to perform the extraction and the mapping of the data onto the template without analyzing the various requirements.

Personally, I prefer having less complex DM configs and templates, even if that means having to create more of them. Simple resources are always easier to maintain and provide better performance.

But obviously, if all your templates are very similar, then it makes sense to use a single template with a few conditions in there. But that means you have to create DM configs that generate data models that are compatible with that template.

I’m sorry if my advice is vague, but as you have probably realized by now, there are so many different ways of achieving similar results with OL Connect that it’s often a question of picking the method you are most comfortable with.

What would be the best way to add one or two blank lines at the end of each record so that I can avoid “is out of bounds” errors?

In this example (036), the data is presented in a few ways; one of them produces the error.

And the other is the file named “Correct”.

You could write a preprocessor script that adds a certain number of lines to all records (search for “preprocessor” in these forums, there are many examples of modifying the data).

But you could also just check, with a condition, whether the line you are trying to read is less than or equal to the total number of lines in the record:

In this instance, the DataMapper extracts data from line 57, but only if the number of lines in the record is equal to or greater than 57.
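The same guard can be sketched in plain JavaScript. Note that this is outside the Connect API: `safeLine` is a hypothetical helper name used here only to illustrate the boundary check, not a DataMapper function.

```javascript
// Sketch of the boundary guard: read a given 1-based line only if
// the record actually contains that many lines, otherwise return "".
// `safeLine` is an illustrative helper, not part of the Connect API.
function safeLine(recordLines, lineNumber) {
  return recordLines.length >= lineNumber ? recordLines[lineNumber - 1] : "";
}

var record = ["HEADER", "AEH036", "John Doe"];
console.log(safeLine(record, 3));  // "John Doe"
console.log(safeLine(record, 57)); // "" instead of an out-of-bounds error
```

The point is simply that the extraction for a high line number is skipped (or yields an empty value) when the record is shorter, which is what the condition step above does visually.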

I tried step.lines, but since the first record in the image has 24 lines and the same model can come with 23 lines, I always run into the limits error, because the line-24 field, already mapped previously, would not exist in the model with 22 lines.

Regarding the preprocessor, I tried to script it, but I always get stuck on the condition for detecting EOF, so that the script knows exactly where the end of the file is and can insert the lines there.

var reader = openTextReader(data.filename, "UTF-8"); 
var tmpFile = createTmpFile(); 
var writer = openTextWriter(tmpFile.getPath(), "UTF-8"); 
var line = null; 

// Copy the contents of the original file to the temporary file
while((line = reader.readLine()) != null){ 
    writer.write(line + "\n"); 
}

// Add two blank lines at the end of the temporary file
writer.write("\n\n");

writer.close(); 
reader.close(); 

// Replace the original file with the temporary file
deleteFile(data.filename); 
tmpFile.move(data.filename);

If you have a sample data file that you can share in private, I will provide you with the appropriate script. Make sure that the sample data contains more than 1 record.

Okay, I just sent it.

The following preprocessor script ensures all data records contain 50 lines.

var inFile= openTextReader(data.filename, "UTF-8"); 
var outFile = openTextWriter(data.filename+".tmp", "UTF-8"); 
var inRecord = false;
var maxLines = 50;
var lineCount = 0;
var newRecord = "1$NEW$";

var line = null; 

// Copy each line to the output; whenever a new record marker is found,
// pad the previous record with blank lines up to maxLines
while((line = inFile.readLine()) != null){
    if(line.startsWith(newRecord)){
        if(inRecord) {
            for(var i = maxLines; i > lineCount; i--){
                outFile.write("\n");
            }
        }
        lineCount = 0;
        inRecord = true;
    }
    outFile.write(line + "\n");
    lineCount++;
}

// Pad the final record as well, since no marker follows it
if(inRecord){
    for(var i = maxLines; i > lineCount; i--){
        outFile.write("\n");
    }
}

outFile.close(); 
inFile.close(); 

// Replace the original file with the temporary file
deleteFile(data.filename);
copyFile(data.filename + ".tmp", data.filename);
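The padding logic can also be exercised outside Connect as a plain-JavaScript sketch. The `padRecords` name and the string-in/string-out interface are assumptions for illustration; the Connect script above works on files through the DataMapper API instead.

```javascript
// Plain-JavaScript sketch of the padding logic: every record, delimited
// by a marker line, is padded with blank lines up to maxLines.
// `padRecords` is an illustrative name, not part of the Connect API.
function padRecords(text, marker, maxLines) {
  var out = [];
  var lineCount = 0;
  var inRecord = false;
  text.split("\n").forEach(function (line) {
    if (line.indexOf(marker) === 0) {
      // A new record starts: pad the previous one first
      if (inRecord) {
        for (var i = lineCount; i < maxLines; i++) out.push("");
      }
      lineCount = 0;
      inRecord = true;
    }
    out.push(line);
    lineCount++;
  });
  // Pad the final record too, since no marker follows it
  if (inRecord) {
    for (var j = lineCount; j < maxLines; j++) out.push("");
  }
  return out.join("\n");
}
```

With marker `1$NEW$` and `maxLines` set to 4, for example, a two-line record gains two blank lines, so every record ends up the same height regardless of how many data lines it originally had.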

This script can help me in some way. But what I really need today is something that performs the same role as the PP7 Channel Skip emulation, because it was able to adjust to a variable number of lines within each file. In Connect, I see that the lines below a line with no data “jump” up to fill its place; this did not happen with Channel Skip, where the position was maintained with or without data in the previous line.

It just so happens that we are currently working on a conversion service for PP7 templates. I know for a fact that Channel Skip emulation is one of the items the service attempts to convert to a Connect data mapping config. It does that by generating a preprocessing script.

Could you send me, in private, a PP7 template so that I can run it through that conversion service and see if it comes closer to what you’re looking for?
