SAX Error – Content is not allowed in prolog
We use a SAX parser to parse an XML file, and hit the following error message:
org.xml.sax.SAXParseException; systemId: ../src/main/resources/staff.xml;
lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
In short, invalid text or a BOM before the XML declaration, or parsing the file with a different encoding, will cause the SAX error "Content is not allowed in prolog".
- 1. Invalid text before the XML declaration.
- 2. BOM at the beginning of the XML file.
- 3. Different encoding format
- 4. Download Source Code
- 5. References
1. Invalid text before the XML declaration.
Any text before the XML declaration will cause the "Content is not allowed in prolog" error.
For example, the XML file below contains a stray dot (.) before the XML declaration.
.<?xml version="1.0" encoding="utf-8"?>
<company>
    <staff>
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
</company>
To fix it, delete any text before the XML declaration.
<?xml version="1.0" encoding="utf-8"?>
<company>
    <staff>
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
</company>
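To see the difference between the two files above in code, here is a minimal sketch that parses both variants from a string with the JDK's built-in SAX parser. The class name PrologCheck and the helper parseError are hypothetical names for this illustration, not part of the original source code.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the article's repository
public class PrologCheck {

    // Returns null if the XML parses, otherwise the parser's error message
    static String parseError(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            // DefaultHandler ignores all content and rethrows fatal errors
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<company><staff><nickname>mkyong</nickname></staff></company>";
        String bad = "." + good; // stray dot before the XML declaration

        System.out.println("good: " + parseError(good));
        System.out.println("bad : " + parseError(bad));
    }
}
```

The second call should report the prolog error, because the dot is the first thing the parser sees at line 1, column 1.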
2. BOM at the beginning of the XML file.
Many text editors automatically add a BOM (byte order mark) to UTF-8 files.
Note
Tested with Java 11 and Java 8, the built-in SAX parser parses a UTF-8 file with a BOM correctly; however, some developers have reported that the BOM caused an XML parsing error.
To fix it, remove the BOM from the UTF-8 file.
- Remove the BOM via code.
- In Notepad++, check Encoding, UTF-8 without BOM.
- In IntelliJ IDEA, right-click the file and select Remove BOM.
P.S. Many text and code editors can add or remove the byte order mark (BOM) for a file; try to find the feature in the menu.
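For the "remove the BOM via code" option, here is a minimal sketch using only the JDK. It wraps the stream and skips the three UTF-8 BOM bytes (EF BB BF) if they are present. The class name Utf8BomStripper and the method skipUtf8Bom are hypothetical names chosen for this example, and the sketch handles only the UTF-8 BOM, not UTF-16 or UTF-32.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the article's repository
public class Utf8BomStripper {

    // Wraps the stream and skips a leading UTF-8 BOM (EF BB BF) if present
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];

        // Read up to 3 bytes; a short stream may return fewer
        int n = 0;
        while (n < 3) {
            int r = pin.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        if (!hasBom && n > 0) {
            pin.unread(head, 0, n); // not a BOM, push the bytes back
        }
        return pin;
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><company/>"
                .getBytes(StandardCharsets.UTF_8);

        // Simulate a file saved as UTF-8 with BOM
        byte[] withBom = new byte[xml.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(xml, 0, withBom, 3, xml.length);

        try (InputStream is = skipUtf8Bom(new ByteArrayInputStream(withBom))) {
            System.out.println("first byte after stripping: " + (char) is.read());
        }
    }
}
```

The returned stream can be passed to the SAX parser in place of the original one.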
3. Different encoding format
A mismatched encoding also causes the common "Content is not allowed in prolog" error.
For example, take a UTF-8 XML file.
<?xml version="1.0" encoding="utf-8"?>
<Company>
    <staff id="1001">
        <name>mkyong</name>
        <role>support</role>
        <salary currency="USD">5000</salary>
        <!-- for special characters like < &, need CDATA -->
        <bio><![CDATA[HTML tag <code>testing</code>]]></bio>
    </staff>
    <staff id="1002">
        <name>yflow</name>
        <role>admin</role>
        <salary currency="EUR">8000</salary>
        <bio><![CDATA[a & b]]></bio>
    </staff>
</Company>
Then we use UTF-16 to parse the above UTF-8 encoded XML file.
SAXParserFactory factory = SAXParserFactory.newInstance();
try (InputStream is = getXMLFileAsStream()) {
    SAXParser saxParser = factory.newSAXParser();
    // parse XML and map it to objects; it works, but is not recommended, try JAXB
    MapStaffObjectHandlerSax handler = new MapStaffObjectHandlerSax();
    // use the XMLReader for more configuration options
    XMLReader xmlReader = saxParser.getXMLReader();
    xmlReader.setContentHandler(handler);
    InputSource source = new InputSource(is);
    // wrong: forces UTF-16 decoding on a UTF-8 XML file
    source.setEncoding(StandardCharsets.UTF_16.toString());
    xmlReader.parse(source);
    // print all
    List<Staff> result = handler.getResult();
    result.forEach(System.out::println);
} catch (ParserConfigurationException | SAXException | IOException e) {
    e.printStackTrace();
}
Output
[Fatal Error] :1:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1243)
at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
at com.mkyong.xml.sax.ReadXmlSaxParser2.main(ReadXmlSaxParser2.java:45)
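One likely way to avoid this mismatch is to not force an encoding on the InputSource at all, so the parser can detect it from the byte stream and the XML declaration, or to set an encoding that matches the file. The sketch below compares the two cases on the same UTF-8 bytes. The class name EncodingMismatchDemo and the helper parseError are hypothetical names for this illustration, and the sketch uses a no-op DefaultHandler instead of the MapStaffObjectHandlerSax handler above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the article's repository
public class EncodingMismatchDemo {

    // Parses the bytes; forcedEncoding == null lets the parser detect the encoding.
    // Returns null on success, otherwise a description of the failure.
    static String parseError(byte[] xmlBytes, String forcedEncoding) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler); // rethrows fatal errors instead of printing them

        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        if (forcedEncoding != null) {
            source.setEncoding(forcedEncoding);
        }
        try {
            reader.parse(source);
            return null;
        } catch (SAXException | IOException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] utf8Xml = ("<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<Company><staff id=\"1001\"><name>mkyong</name></staff></Company>")
                .getBytes(StandardCharsets.UTF_8);

        // Encoding forced to UTF-16 on UTF-8 bytes: the decoded text is garbage
        System.out.println("forced UTF-16: " + parseError(utf8Xml, StandardCharsets.UTF_16.name()));

        // No forced encoding: the parser reads the declaration and uses UTF-8
        System.out.println("auto-detect  : " + parseError(utf8Xml, null));
    }
}
```

In the original example, this corresponds to removing the source.setEncoding(...) line, or passing StandardCharsets.UTF_8.toString() instead.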
4. Download Source Code
$ git clone https://github.com/mkyong/core-java
$ cd java-xml
$ cd src/main/java/com/mkyong/xml/sax/
Hi all,
Check Encoding “UTF-8 without BOM” in Notepad++ if nothing was checked there before.
This solution works. Thanks
perfect thanks
This worked for me. May be you should update this fix in the above section.
Thanks, worked for me! 🙂
how solved ???
In my case there was no BOM character, so I tried adding setValidating(false) before setting the xmlDoc object, and it worked.
factory.setValidating(false);
xmlDoc = factory.newDocumentBuilder().parse(filePath);
Files with a BOM crash with the same error, so wrapping the InputStream with Apache Commons IO BOMInputStream solved the issue.
Thanks for the post. was useful to me