This issue took me some time to solve:
I needed to validate a document using the Xerces parser. The document references an .xsd schema, which in turn references a DTD (http://www.w3.org/2001/XMLSchema.dtd). However, the document would not validate when there was no connection to the Internet.
Using
URL mySchemaURL = getClass().getResource("/com/mycompany/mySchema.xsd");
documentBuilderFactory.setAttribute("http://apache.org/xml/properties/schema/external-schemaLocation", "http://server.com/myschema " + mySchemaURL.toString());
I managed to make the parser read the XML schema from the JAR file. However, this did not work for the DTD referenced from mySchema.xsd.
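A side note on the external-schemaLocation value used above: it has the same form as an xsi:schemaLocation attribute, that is, whitespace-separated pairs of namespace URI and schema location. If you need to map more than one namespace, a small helper can build the value. This is only a sketch; the class name SchemaLocations and the method toPropertyValue are made up for illustration and are not part of Xerces.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Xerces: builds the value for the
// external-schemaLocation property from namespace -> location pairs.
class SchemaLocations {

    /** Joins each namespace and its schema location with single spaces, in map order. */
    static String toPropertyValue(Map<String, String> namespaceToLocation) {
        StringJoiner joined = new StringJoiner(" ");
        for (Map.Entry<String, String> entry : namespaceToLocation.entrySet()) {
            joined.add(entry.getKey()).add(entry.getValue());
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the pairs in insertion order.
        Map<String, String> locations = new LinkedHashMap<>();
        // The location below is a placeholder; in the post it comes from getResource(...).
        locations.put("http://server.com/myschema", "file:/tmp/mySchema.xsd");
        System.out.println(toPropertyValue(locations));
    }
}
```

The resulting string can be passed as the second argument of the setAttribute call shown earlier. Locations must not contain unescaped spaces, because whitespace is the separator.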
The error message was rather unhelpful:
org.xml.sax.SAXParseException: schema_reference.4: Failed to read schema document 'myschema.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
It turned out that the parser was trying to retrieve http://www.w3.org/2001/XMLSchema.dtd from the Internet. The external-schemaLocation approach did not work for DTDs.
I then found an article (http://tynne.de/xerces-w3c) about caching DTDs, which led me to the following solution:
import java.io.IOException;
import java.io.InputStream;
import java.net.CacheRequest;
import java.net.CacheResponse;
import java.net.ResponseCache;
import java.net.URI;
import java.net.URLConnection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DTDResponseCache extends ResponseCache {
    /** Maps each DTD URI to the classpath location of its local copy. */
    private final Map<URI, String> savedDTD = new HashMap<>();
    /**
     * Original cache used for requests other than specified DTDs
     */
    private final ResponseCache originalCache;

    public DTDResponseCache(ResponseCache originalCache) {
        this.originalCache = originalCache;
        // Add your DTDs here
        savedDTD.put(URI.create("http://www.w3.org/2001/XMLSchema.dtd"), "/com/mycompany/dtds/XMLSchema.dtd");
        savedDTD.put(URI.create("http://www.w3.org/2001/datatypes.dtd"), "/com/mycompany/dtds/datatypes.dtd");
    }

    @Override
    public CacheResponse get(final URI uri, String rqstMethod, Map<String, List<String>> rqstHeaders) throws IOException {
        if (savedDTD.containsKey(uri)) {
            // Serve the local copy instead of letting the request go out to the network
            return new CacheResponse() {
                @Override
                public Map<String, List<String>> getHeaders() throws IOException {
                    // No response headers are needed for the parser to read the DTD
                    return new HashMap<String, List<String>>();
                }

                @Override
                public InputStream getBody() throws IOException {
                    // Look the resource up via the outer class rather than the anonymous one
                    return DTDResponseCache.class.getResourceAsStream(savedDTD.get(uri));
                }
            };
        }
        // Not one of our DTDs: delegate to the previous cache, if there was one
        return originalCache != null ? originalCache.get(uri, rqstMethod, rqstHeaders) : null;
    }

    @Override
    public CacheRequest put(URI uri, URLConnection conn) throws IOException {
        // Storing responses is left entirely to the previous cache
        return originalCache != null ? originalCache.put(uri, conn) : null;
    }
}
Now you only need to save the DTDs somewhere accessible to your classloader, and then call this on application startup:
ResponseCache.setDefault(new DTDResponseCache(ResponseCache.getDefault()));
This preserves your previous caching settings (if any).
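To check that the cache is really installed and answers for the DTD URI, you can query the default ResponseCache directly, without going through the parser or the network. The sketch below is self-contained, so instead of DTDResponseCache it uses a hypothetical stand-in (StubDtdCache) that serves a fixed in-memory string. The names StubDtdCache, DtdCacheCheck and cachedBody are made up for this example. It assumes Java 9 or later because of InputStream.readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.CacheRequest;
import java.net.CacheResponse;
import java.net.ResponseCache;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DTDResponseCache: serves one fixed body for one URI.
class StubDtdCache extends ResponseCache {
    static final URI DTD_URI = URI.create("http://www.w3.org/2001/XMLSchema.dtd");
    static final String STUB_BODY = "<!-- stub DTD -->";

    @Override
    public CacheResponse get(URI uri, String method, Map<String, List<String>> headers) {
        if (!DTD_URI.equals(uri)) {
            return null; // no entry for this URI
        }
        return new CacheResponse() {
            @Override
            public Map<String, List<String>> getHeaders() {
                return Collections.emptyMap();
            }

            @Override
            public InputStream getBody() {
                return new ByteArrayInputStream(STUB_BODY.getBytes(StandardCharsets.UTF_8));
            }
        };
    }

    @Override
    public CacheRequest put(URI uri, URLConnection conn) {
        return null; // this stub never stores responses
    }
}

class DtdCacheCheck {
    /** Returns the body the default cache holds for the URI, or null if there is none. */
    static String cachedBody(URI uri) throws IOException {
        ResponseCache cache = ResponseCache.getDefault();
        if (cache == null) {
            return null; // no cache installed at all
        }
        CacheResponse response =
                cache.get(uri, "GET", Collections.<String, List<String>>emptyMap());
        if (response == null) {
            return null; // cache installed, but no entry for this URI
        }
        try (InputStream in = response.getBody()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        ResponseCache.setDefault(new StubDtdCache());
        System.out.println(cachedBody(StubDtdCache.DTD_URI));
    }
}
```

With the real DTDResponseCache installed, the same cachedBody call should return the contents of the bundled DTD file. A null result would suggest that the URI key or the classpath location does not match.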