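This program reads two HTML files from my site and parses each one. The first (pass.html) has no DOCTYPE declaration, and it parses normally.
The second (freeze.html) has a DOCTYPE declaration. The W3C validation service reports freeze.html as fully valid.
However, when I try to parse freeze.html, the program freezes at .parse(is).
What is wrong?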
import java.io.InputStream;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
class DOMCallFreezes {
    public static void main(String[] args) throws Exception {
        new DOMCallFreezes().main();
    }

    void main() throws Exception {
        demo("pass.html");
        demo("freeze.html");
    }

    void demo(String htmlName) throws Exception {
        final String baseUrl = "http://x19290.appspot.com/dom-no-good/";
        URL url = new URL(baseUrl + htmlName);
        try (final InputStream is = url.openStream()) {
            final Document doc = newDocumentBuilder().parse(is);
            final DOMSource src = new DOMSource(doc);
            final StreamResult dst = new StreamResult(System.out);
            newTransformer().transform(src, dst);
        }
    }

    DocumentBuilder newDocumentBuilder() throws Exception {
        final DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        return f.newDocumentBuilder();
    }

    Transformer newTransformer() throws Exception {
        final TransformerFactory f = TransformerFactory.newInstance();
        return f.newTransformer();
    }
}
pass.html
<?xml version="1.0" encoding="US-ASCII"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>pass</title>
</head>
<body>
<h1>no DOCTYPE declaration</h1>
</body>
</html>
freeze.html
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>freeze</title>
</head>
<body>
<h1>has DOCTYPE declaration</h1>
</body>
</html>
Answer 0 (score: 1)
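The parser is presumably stalling while it tries to fetch the external DTD referenced by the DOCTYPE declaration. The following settings tell the parser not to load that external DTD. Change the method newDocumentBuilder() as follows: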
DocumentBuilder newDocumentBuilder() throws Exception {
    final DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
    f.setValidating(false);
    f.setAttribute("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    return f.newDocumentBuilder();
}
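To try the setting without depending on the remote site, here is a minimal self-contained sketch. It assumes the JDK's bundled Xerces-based parser, which recognizes the load-external-dtd feature. The class name NoExternalDtdDemo, the helper parseWithoutDtd, and the in-memory copy of freeze.html are hypothetical names introduced only for this illustration, and the sketch uses setFeature instead of setAttribute to pass the same feature URI.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class NoExternalDtdDemo {
    // Xerces feature URI taken from the answer above.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // In-memory copy of freeze.html (assumption: same content as the remote file).
    static final String FREEZE_HTML =
        "<?xml version=\"1.0\" encoding=\"US-ASCII\"?>\n"
        + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\" "
        + "\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">\n"
        + "<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">\n"
        + "<head><title>freeze</title></head>\n"
        + "<body><h1>has DOCTYPE declaration</h1></body>\n"
        + "</html>\n";

    // Parses the given XML text without fetching the external DTD.
    static Document parseWithoutDtd(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        f.setValidating(false);
        // Throws ParserConfigurationException if the parser does not know the feature.
        f.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder b = f.newDocumentBuilder();
        byte[] bytes = xml.getBytes(StandardCharsets.US_ASCII);
        return b.parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseWithoutDtd(FREEZE_HTML);
        // Prints the root element's tag name.
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Because the external DTD is skipped, entities that only the DTD defines (for example &nbsp;) would no longer resolve, so this approach suits documents that do not rely on them.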