DTD download error when parsing an XHTML document with XOM

Time: 2009-06-15 20:43:50

Tags: java dtd xom

I am trying to parse an HTML document whose doctype declaration refers to the XHTML 1.0 Transitional DTD, as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

When I call Builder.build on the document, I get the following exception:

  java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1305)
       at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
       at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
       at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
       at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
       at nu.xom.Builder.build(Builder.java:1127)
       at nu.xom.Builder.build(Builder.java:1019)

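If I remove the doctype declaration, the document parses fine. I can download the DTD successfully from my browser, which tells me the URL is valid. I don't want to remove the doctype declaration. Is there a way to tell the Builder not to download the DTD, or to supply it with a different DTD?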

2 answers:

Answer 0: (score: 7)

This solved the problem:

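import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
// Xerces feature: do not fetch the external DTD when not validating
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
Document document = factory.newDocumentBuilder().parse(is);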
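The snippet above uses the JAXP DOM API rather than XOM. As a rough sketch, the same feature could probably be set on a SAX XMLReader, which XOM's Builder accepts through its Builder(XMLReader) constructor. The class name NoExternalDtdReader and its helper methods are made up for illustration, and the code assumes the underlying parser is Xerces or the JDK's built-in parser, which recognise this feature. The XOM call itself is only shown in a comment, so the example runs with the JDK alone.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of XOM or of the original answer.
class NoExternalDtdReader {
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Creates a non-validating, namespace-aware reader that skips the external DTD.
    static XMLReader newReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // Assumption: the parser is Xerces-based; other parsers may reject this feature.
        reader.setFeature(LOAD_EXTERNAL_DTD, false);
        return reader;
    }

    // Parses the document and returns the name of its root element.
    static String rootElementName(String xml) throws Exception {
        XMLReader reader = newReader();
        final String[] root = new String[1];
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = qName;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>hi</p></body></html>";
        System.out.println(rootElementName(xhtml));
        // With XOM on the classpath, the reader would be handed to the Builder:
        //   nu.xom.Document doc = new nu.xom.Builder(newReader()).build(inputStream);
    }
}
```

Note that once the external DTD is skipped, entities it declares (such as &nbsp;) are no longer defined for the parser, so this approach fits best when the document does not rely on them.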

Answer 1: (score: 3)

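From a quick look at the Builder javadoc, I would guess that you can supply an XMLReader configured with an EntityResolver through the constructor. I would try to avoid letting the parser download files from the internet.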
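A minimal sketch of that idea, assuming a resolver that answers every request for a .dtd with an empty document so that nothing is fetched over the network. The class name OfflineDtdResolver is hypothetical; a real setup might instead return an InputSource pointing at a local copy of xhtml1-transitional.dtd. The XOM call is shown only in a comment so the example runs with the JDK alone.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical resolver, not part of XOM: replaces any external DTD with an empty one.
class OfflineDtdResolver implements EntityResolver {
    // Remembers the last system ID that was intercepted, for inspection.
    String lastIntercepted;

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId != null && systemId.endsWith(".dtd")) {
            lastIntercepted = systemId;
            // An empty external subset: the parser reads this instead of the URL.
            // A local copy of the DTD could be returned here instead.
            return new InputSource(new StringReader(""));
        }
        return null; // fall back to the parser's default resolution
    }

    // Builds a namespace-aware reader that uses the given resolver.
    static XMLReader newReader(EntityResolver resolver) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        return reader;
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>hi</p></body></html>";
        OfflineDtdResolver resolver = new OfflineDtdResolver();
        XMLReader reader = newReader(resolver);
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader(xhtml)));
        System.out.println("Intercepted: " + resolver.lastIntercepted);
        // With XOM on the classpath, the same reader would be passed to the Builder:
        //   nu.xom.Document doc = new nu.xom.Builder(newReader(resolver)).build(inputStream);
    }
}
```

Returning a local copy of the DTD rather than an empty one keeps entity declarations such as &nbsp; available, which may matter for real XHTML pages.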