Xml vs json file formats for transfer across network

Hi there, slightly unusual post but hoping someone could help. I’m going to be using connect as part of a solution for an academic course that includes a work based project. Project is replacing manual process for car parks staff who use local printer, generate their own reports, envelope stuffing etc. As part of the project I need to investigate file format for data transfer, these files will include images taken of offending car/van. Is there any material like case studies OL have access to where this kind of project has been done before? I’m looking to justify why you would select one format over another XML v JSON for example. I appreciate this is not a support request and you chaps are very busy, hopefully someone could perhaps have a link to case studies carried out by OL or something similar. Thanks very much.

We don’t have case studies that are that specific… but I can give you my opinion on the matter.

From the DataMapper’s point of view, XML and JSON and almost equal in how they are accessed and processed. Both are structured formats that allow for easy representation of hierarchical structures.

However, there are a few key differences:

  • XML is more verbose. It requires end-tags, whereas JSON doesn’t. And because XML is usually the default format when exporting database content, it often contains a slew of empty elements (e.g. <MyElement/>) that slow down processing even though they are optional and should not have been embedded in the data to start with. But if your XML is clean, then it really is very similar to JSON.

  • The JSON implementation in the DataMapper allows you to use the JSONPath query language, which makes it an extremely easy and efficient data format. In fact, from our own performance tests here, we have found JSON to be the fastest of all data formats in most cases. However, JSONPath is not as powerful as XPATH, so the performance comes at the cost of flexibility and versatility. But then again, you can learn JSONPath in minutes, while XPATH takes days (or more!).

Overall, XML requires more memory to process than JSON, so that’s a consideration when each of your records are fairly large. You mention images in your question, so that means each record could possibly contain very large blobs of data. JSON would handle those more efficiently.

And finally, with JSON having become the de facto standard in data exchange on the web, it is also extremely simple to process from within a Script, whereas XML is - to put it bluntly - a pain in the butt.

For all these reasons, I would definitely pick JSON as my format of choice.

Thanks Phil, thats really interesting, I appreciate you taking the time to reply.
May I ask a few dumb questions please having never used JSON with Connect?
Without the start/end tags provided by XML are you reliant on a UI being provided within the dataset to id the start/end of a record in the datamapper?
Am I right in thinking both formats use base64 encoding to convert an image to a string?
I’m guessing I could use CSS in the designer to make these images look pretty.
In terms of securing the data, it will be on an internal network can you use connect to secure the channel e.g https as pose to securing the data itself using some form of encryption.
Just a little background I will have access to the host system database but I’m not allowed to use connect to directly access the data. An application will be created to make the data useable for Connect, I want to make sure the app creates the file in the most friendly way for Connect to handle and after what you have mentioned in JSON format. Thanks again Phil.

As far as document boundaries in JSON files is concerned, it really is similar to what you’d see in XML, except that sometimes fields don’t necessarily have names. So for instance, look at the following JSON structure:

[ 
  {
    "name": "My First Record",
    "ID": "00000001"
  },
  {
    "name": "My Second Record",
    "ID": "00000002"
  }
]

With the JSON data type in the DataMapper, you have to specify which element is the parent of the element that constitutes the record boundary. In the above example, using JSONPath notation, you would specify $.[] (which means the first array in the file).

The equivalent XML structure would be something like this:

<documents>
  <document>
    <name>My First Record</name>
    <id>00000001</id>
  </document>
  <document>
    <name>My Second Record</name>
    <id>00000002</id>
  </document>
</documents>

For XML, you specify the actual element on which record boundaries occur, so that would be /documents/document.

As for images, you are right that in both cases they would be stored as Base64 blobs, so there really is no difference here, except that the JSON format will be slightly faster to process (not because of the blob itself but because of the way JSON is parsed vs. XML).

And yes, you can use HTTPS to fetch/send the data stream, there is no need for additional encryption.

Thanks very much Phil, that gives me exactly what I was looking for, interesting to see the difference in how the datamapper reads each format. I’m going to opt for JSON format we dont process files this way currently so it will be good to start. Thanks again Phil, this has been really useful.