In my workflow I have created a script that merges several smaller XML files into one large XML file.
The merging works fine, but afterwards I noticed that the output contains illegal characters such as ' (which should be escaped as &apos;). Note that the source XML does not contain these characters.
I found some additional code on the internet that saves the document in a different way, but unfortunately it has no effect on the output. Here is a small snippet that just loads and saves an XML file:
    Dim SourceXML
    Set SourceXML = CreateObject("MSXML2.DomDocument.6.0")
    ' Load synchronously so that the return value of Load reflects the parse result
    SourceXML.async = False
    If SourceXML.Load(full_path_to_source_file) Then
        FormatDocToFile SourceXML, full_path_to_output_file
    End If

    ' Replays the DOM through a SAX reader into an MXXMLWriter and saves the result
    Sub FormatDocToFile(xmlDom, sFileName)
        Dim strm, writer, reader
        Set strm = CreateObject("ADODB.Stream")
        With strm
            .Open
            .Type = 1 'adTypeBinary
            Set writer = CreateObject("Msxml2.MXXMLWriter")
            With writer
                .omitXMLDeclaration = False
                .standalone = True
                .byteOrderMark = False
                .encoding = "ISO-8859-1"
                .indent = True
                .output = strm
                .disableOutputEscaping = False
                Set reader = CreateObject("Msxml2.SAXXMLReader.6.0")
                With reader
                    ' The writer acts as the handler for every SAX event type
                    Set .contentHandler = writer
                    Set .dtdHandler = writer
                    Set .errorHandler = writer
                    .putProperty "http://xml.org/sax/properties/lexical-handler", writer
                    .putProperty "http://xml.org/sax/properties/declaration-handler", writer
                    .parse xmlDom
                End With
            End With
            .SaveToFile sFileName, 2 'adSaveCreateOverWrite
            .Close
        End With
    End Sub
The source XML for example contains this tag:
<equiptype>20&apos; Tank Container</equiptype>
The output file will contain:
<equiptype>20' Tank Container</equiptype>
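The behaviour can also be reproduced without any files. This is only a minimal sketch: it assumes the source uses the predefined entity &apos;, and the inline sample string and the variable names decodedText and serialized are made up for illustration.

```vbscript
Option Explicit

Dim doc, decodedText, serialized
Set doc = CreateObject("MSXML2.DOMDocument.6.0")
doc.async = False

' Hypothetical in-memory sample standing in for one of the source files
If doc.loadXML("<equiptype>20&apos; Tank Container</equiptype>") Then
    ' Text as the DOM sees it after parsing
    decodedText = doc.documentElement.text
    ' Markup as the DOM serializes it again
    serialized = doc.xml
    WScript.Echo decodedText
    WScript.Echo serialized
Else
    WScript.Echo "Parse error: " & doc.parseError.reason
End If
```

As far as I can tell, the entity is already decoded by the time the DOM is built, so whatever writes the document out afterwards only ever sees a plain apostrophe.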
Is there a way to make sure that illegal characters are properly converted in my output file?