This issue took me some time to solve:
I needed to validate a document using the Xerces parser. The document references an .xsd schema, which in turn references a DTD (http://www.w3.org/2001/XMLSchema.dtd). However, the document would not validate when there was no Internet connection.
Using
URL mySchemaURL = getClass().getResource("/com/mycompany/mySchema.xsd");
documentBuilderFactory.setAttribute("http://apache.org/xml/properties/schema/external-schemaLocation", "http://server.com/myschema " + mySchemaURL.toString());
I managed to make the parser read the XML schema from the JAR file. However, this did not work for the DTD referenced from mySchema.xsd.
The error message was rather unhelpful:
org.xml.sax.SAXParseException: schema_reference.4: Failed to read schema document 'myschema.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
It turned out that the parser was trying to retrieve http://www.w3.org/2001/XMLSchema.dtd from the Internet. The external-schemaLocation property only maps schema namespaces to schema locations, so it did not help with DTDs.
I then found an article (http://tynne.de/xerces-w3c) about caching DTDs, which led me to the following solution:
import java.io.IOException;
import java.io.InputStream;
import java.net.CacheRequest;
import java.net.CacheResponse;
import java.net.ResponseCache;
import java.net.URI;
import java.net.URLConnection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DTDResponseCache extends ResponseCache {
    /** Maps each DTD URI to the classpath location of its local copy. */
    Map<URI, String> savedDTD;

    /**
     * Original cache used for requests other than specified DTDs
     */
    ResponseCache originalCache;

    public DTDResponseCache(ResponseCache originalCache) {
        this.originalCache = originalCache;
        savedDTD = new HashMap<URI, String>();
        // Add your DTDs here
        savedDTD.put(URI.create("http://www.w3.org/2001/XMLSchema.dtd"), "/com/mycompany/dtds/XMLSchema.dtd");
        savedDTD.put(URI.create("http://www.w3.org/2001/datatypes.dtd"), "/com/mycompany/dtds/datatypes.dtd");
    }

    @Override
    public CacheResponse get(final URI uri, String rqstMethod, Map<String, List<String>> rqstHeaders) throws IOException {
        if (savedDTD.containsKey(uri)) {
            // Serve the local copy instead of going to the network
            return new CacheResponse() {
                @Override
                public Map<String, List<String>> getHeaders() throws IOException {
                    // No response headers are needed for a locally served DTD
                    return new HashMap<String, List<String>>();
                }

                @Override
                public InputStream getBody() throws IOException {
                    return DTDResponseCache.class.getResourceAsStream(savedDTD.get(uri));
                }
            };
        } else {
            // Anything else is delegated to the previously installed cache, if any
            if (originalCache != null) {
                return originalCache.get(uri, rqstMethod, rqstHeaders);
            } else {
                return null;
            }
        }
    }

    @Override
    public CacheRequest put(URI uri, URLConnection conn) throws IOException {
        if (originalCache != null) {
            return originalCache.put(uri, conn);
        } else {
            return null;
        }
    }
}
Now you only need to save the DTDs somewhere accessible to your classloader and then, on application startup, call:
ResponseCache.setDefault(new DTDResponseCache(ResponseCache.getDefault()));
This preserves your previous caching settings (if any).
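To check the lookup logic without any network access, you can call get() on the cache directly. The sketch below is not the class from this post: InMemoryDtdCache and DtdCacheCheck are hypothetical names, and the DTD bodies are kept in memory rather than on the classpath so that the example is self-contained. It assumes Java 9 or later for InputStream.readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.CacheRequest;
import java.net.CacheResponse;
import java.net.ResponseCache;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DTDResponseCache: same lookup and fallback logic,
// but the bodies are in-memory strings instead of classpath resources.
class InMemoryDtdCache extends ResponseCache {
    private final Map<URI, String> bodies = new HashMap<>();
    private final ResponseCache fallback; // may be null

    InMemoryDtdCache(ResponseCache fallback) {
        this.fallback = fallback;
    }

    void register(String url, String body) {
        bodies.put(URI.create(url), body);
    }

    @Override
    public CacheResponse get(URI uri, String method, Map<String, List<String>> headers) throws IOException {
        final String body = bodies.get(uri);
        if (body == null) {
            // Not one of our DTDs: defer to the previous cache, if there was one
            return fallback != null ? fallback.get(uri, method, headers) : null;
        }
        return new CacheResponse() {
            @Override
            public Map<String, List<String>> getHeaders() {
                return Collections.emptyMap();
            }

            @Override
            public InputStream getBody() {
                return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
            }
        };
    }

    @Override
    public CacheRequest put(URI uri, URLConnection conn) throws IOException {
        return fallback != null ? fallback.put(uri, conn) : null;
    }
}

class DtdCacheCheck {
    // Returns the cached body for the URL, or null if the cache does not handle it.
    static String lookup(ResponseCache cache, String url) {
        try {
            CacheResponse response = cache.get(URI.create(url), "GET", Collections.emptyMap());
            if (response == null) {
                return null;
            }
            try (InputStream in = response.getBody()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling DtdCacheCheck.lookup with a registered URL should return the stored text, and an unregistered URL should return null when there is no fallback cache. The same kind of check can be pointed at the real DTDResponseCache to confirm that the resource paths resolve inside your JAR.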