Last Updated: September 29, 2017
· inuyasha82

How to parse an XML file without a root element (or malformed XML) in Java

To be well-formed, an XML document must contain exactly one root element.
So, for example, this is well-formed XML:

<root>
     <element>...</element>
     <element>...</element>
</root>

But if you have a document like:

<element>...</element>
<element>...</element>
<element>...</element>
<element>...</element>

This is considered malformed XML, so most XML parsers simply throw an exception complaining that the document does not have a single root element.
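The failure can be reproduced with a few lines of StAX code. This is only a minimal sketch: the class name RootlessXmlFailure and the helper parsesCleanly are made up for illustration, and the exact error message depends on the parser implementation in use.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the original tip.
final class RootlessXmlFailure {

    /** Returns true if the StAX parser reads the whole input without an error. */
    static boolean parsesCleanly(String xml) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // Pull every event; a well-formedness error surfaces as XMLStreamException.
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A document with a single root element.
        System.out.println(parsesCleanly("<root><element>a</element></root>"));
        // The same elements without the enclosing root.
        System.out.println(parsesCleanly("<element>a</element><element>b</element>"));
    }
}
```

The second call is the case this tip is about: the parser reads the first element as the root and then fails when it meets the next one.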

This example shows one way to work around the problem and successfully parse the malformed XML above.

Basically, what we will do is add a root element programmatically.

So first of all you have to open the resource that contains your "malformed" XML (e.g. a file):

File file = new File(pathtofile);

Then open a FileInputStream:

FileInputStream fis = new FileInputStream(file);

If we try to parse this stream with an XML library at this point, it will throw the malformed document exception.

Now we create a list of InputStream objects with three elements:

  1. A ByteArrayInputStream that contains the string: "<root>"
  2. Our FileInputStream
  3. A ByteArrayInputStream that contains the string: "</root>"

So the code is:

List<InputStream> streams =
    Arrays.asList(
        new ByteArrayInputStream("<root>".getBytes()),
        fis,
        new ByteArrayInputStream("</root>".getBytes()));

Now, using a SequenceInputStream, we concatenate the streams in the list created above into a single stream:

InputStream cntr =
    new SequenceInputStream(Collections.enumeration(streams));

Now we can run any XML parser library on cntr, and the content will be parsed without any problem (checked with the StAX library).
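Putting the steps together, the whole technique can be sketched as one small program. The class name RootWrapperDemo and the helpers wrapInRoot and countElements are illustrative names, not part of any library, and the input is an in-memory string standing in for the file so that the sketch runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the original tip.
final class RootWrapperDemo {

    /** Surrounds the given stream with "<root>" and "</root>" without copying it. */
    static InputStream wrapInRoot(InputStream body) {
        List<InputStream> streams = Arrays.asList(
            new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8)),
            body,
            new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8)));
        return new SequenceInputStream(Collections.enumeration(streams));
    }

    /** Counts the start tags with the given local name using StAX. */
    static int countElements(InputStream in, String name) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && name.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close(); // closes the reader only, not the underlying stream
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the rootless file; a FileInputStream would be used the same way.
        String rootless = "<element>a</element>\n<element>b</element>\n<element>c</element>\n";
        InputStream body = new ByteArrayInputStream(rootless.getBytes(StandardCharsets.UTF_8));
        try (InputStream cntr = wrapInRoot(body)) {
            System.out.println(countElements(cntr, "element"));
        }
    }
}
```

Closing the SequenceInputStream also closes the streams it wraps, so the try-with-resources block takes care of the file stream as well. The sketch assumes the file does not start with an XML declaration (<?xml ...?>), because a declaration is only allowed at the very beginning of a document and would no longer be there once "<root>" is prepended.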

2 Responses

Very nice tip. thanks!

over 1 year ago ·

Ivan,
I believe the last line of code should be (change str to streams):
InputStream cntr = new SequenceInputStream(Collections.enumeration(streams));

over 1 year ago ·