Thu 8 Dec '05 | by Frank Spychalski | filed under articles
This post explains how to use XML Schema to validate a document and XInclude to split it into manageable chunks, in Java using Xerces.
For this example I use the following documents:
- main.xml: the main document, which includes include.xml via XInclude
- include.xml: the included document, which contains just a single element
- main.xsd: a simple schema for the main document
And this is the included file:
and a very simple Schema to validate the file:
The solution seems obvious: use a SAXParser and set the features for Schema validation and XInclude to true.
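A minimal sketch of that configuration, for reference. It is not the code from the original post: the exception below shows that JDOM was used (where the same feature URIs would go through SAXBuilder.setFeature), while this version sticks to plain JAXP/SAX so it has no extra dependencies. The class and method names are made up for illustration, and it assumes the default parser is Xerces or the Xerces-derived parser bundled with the JDK. It only configures the reader and does not parse anything yet.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper class, not part of the original post.
class IncludingValidatingReader {
    // Standard SAX validation switch; Xerces needs it on for Schema validation.
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    // Xerces-specific feature URIs.
    static final String SCHEMA_VALIDATION = "http://apache.org/xml/features/validation/schema";
    static final String XINCLUDE = "http://apache.org/xml/features/xinclude";

    /** Creates a namespace-aware SAX reader with Schema validation and XInclude switched on. */
    static XMLReader newReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // XInclude and XML Schema both rely on namespaces
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature(VALIDATION, true);
            reader.setFeature(SCHEMA_VALIDATION, true);
            reader.setFeature(XINCLUDE, true);
            return reader;
        } catch (Exception e) {
            // Thrown, for example, when the parser does not recognize a feature URI.
            throw new IllegalStateException("Parser does not support the requested features", e);
        }
    }

    /** Reports whether a feature is switched on; unknown features count as off. */
    static boolean isEnabled(XMLReader reader, String feature) {
        try {
            return reader.getFeature(feature);
        } catch (SAXException e) {
            return false;
        }
    }
}
```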
Sadly, it fails with: Exception in thread "main" org.jdom.JDOMException: Error on line 12 of document a pathmain.xml: Error attempting to parse XML file (href='include.xml'). Line 12 is the line containing the XInclude statement, which suggested a syntax error there. But multiple rereads of the spec didn’t help, so I turned off Schema validation for a change. Now everything worked fine, and a look at the resulting document shows what went wrong with the Schema validation:
Xerces inserts an xml:base attribute into each element it includes, and this attribute is not allowed by my Schema. Fortunately, I don’t have to change my Schema to suit this. There’s a feature called fixup-base-uris (http://apache.org/xml/features/xinclude/fixup-base-uris), which is on by default; setting it to false stops Xerces from adding these extra attributes.
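To see the effect of the feature in isolation, here is a small sketch that parses a document with XInclude enabled and checks whether the included element carries an xml:base attribute. The two XML strings are hypothetical stand-ins, not the files from this post, and the class and method names are invented for the example. It uses a DOM parser rather than JDOM so that it runs on a plain JDK, assuming the default parser is Xerces or the Xerces-derived one that ships with the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of the original post.
class FixupBaseUrisDemo {
    static final String XINCLUDE = "http://apache.org/xml/features/xinclude";
    static final String FIXUP_BASE_URIS =
            "http://apache.org/xml/features/xinclude/fixup-base-uris";

    // Stand-in documents; the real main.xml and include.xml are not reproduced here.
    static final String MAIN =
            "<root xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
            + "<xi:include href=\"include.xml\"/></root>";
    static final String INCLUDE = "<included>some text</included>";

    /** Parses the stand-in documents and reports whether xml:base was added. */
    static boolean includedElementHasXmlBase(boolean fixupBaseUris) {
        try {
            // Write both files next to each other so the relative href resolves.
            Path dir = Files.createTempDirectory("xinclude-demo");
            Path main = dir.resolve("main.xml");
            Files.write(main, MAIN.getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("include.xml"), INCLUDE.getBytes(StandardCharsets.UTF_8));

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // XInclude needs namespace processing
            factory.setFeature(XINCLUDE, true);
            factory.setFeature(FIXUP_BASE_URIS, fixupBaseUris);

            Document doc = factory.newDocumentBuilder().parse(main.toFile());
            Element included = (Element) doc.getElementsByTagName("included").item(0);
            return included.hasAttribute("xml:base");
        } catch (Exception e) {
            throw new IllegalStateException("XInclude demo failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fixup on:  xml:base present = " + includedElementHasXmlBase(true));
        System.out.println("fixup off: xml:base present = " + includedElementHasXmlBase(false));
    }
}
```

With the feature left at its default the attribute should be present, and with it set to false the attribute should be gone, which is what lets a Schema that knows nothing about xml:base accept the merged document.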
This works fine, with one minor problem left: if there is an error in the included XML document, parsing fails with: Exception in thread "main" org.jdom.JDOMException: Error on line 12 of document file:a path/main.xml: Error attempting to parse XML file (href='include.xml'). That message is not very helpful, because it points at the XInclude statement in the main document rather than at the actual error in the included file. I’ll post a workaround for this problem tomorrow.
Thanks for this article. It was very helpful.
General FYI: for more parser features, refer to http://xerces.apache.org/xerces2-j/features.html