Why am I getting "org.w3c.dom.DOMException: HIERARCHY_REQUEST_ERR"?

Asked: 2015-08-03 01:06:45

Tags: java html xpath

Full exception stack trace:

Exception in thread "main" org.w3c.dom.DOMException: HIERARCHY_REQUEST_ERR: An attempt was made to insert a node where it is not permitted. 
    at org.apache.xerces.dom.CoreDocumentImpl.insertBefore(Unknown Source)
    at org.apache.xerces.dom.NodeImpl.appendChild(Unknown Source)
    at com.enniu.crawler.core.saxon.main(saxon.java:39)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)

My code:

public class saxon {

    public static void main(String args[]) throws IOException, SAXException, ParserConfigurationException, XPathFactoryConfigurationException, XPathExpressionException {

        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        DocumentBuilder builder = null;
        builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse("test.html");
        Document newDoc = builder.newDocument();
        XPathFactory xpf = XPathFactoryImpl.newInstance(XPathConstants.DOM_OBJECT_MODEL);
        XPath xPath = xpf.newXPath();
        XPathExpression compile = xPath.compile("//div[not (contains(class, 'sss'))]");
        Object result = compile.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        for(int i = 0; i < nodes.getLength(); i++) {
            Node copyNode = newDoc.importNode(nodes.item(i), true);
            newDoc.appendChild(copyNode);// line 39
        }
        printXmlDocument(newDoc);
    }

    public static void printXmlDocument(Document document) {
        DOMImplementationLS domImplementationLS =
                (DOMImplementationLS) document.getImplementation();
        LSSerializer lsSerializer =
                domImplementationLS.createLSSerializer();
        String string = lsSerializer.writeToString(document);
        System.out.println(string);
    }
}

test.html:

<table>
    <div>aa</div>
    <div class="sss">ss</div>
    <div>dd</div>
</table>

1 answer:

Answer 0 (score: 2)

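Because a well-formed XML document cannot have two root elements: a DOM Document node accepts at most one element child. Every call to newDoc.appendChild(copyNode) after the first one tries to add another root. My code tries to generate a document like this: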

<div>aa</div>
<div>dd</div>

That document has two roots, hence the exception.
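
One way around this is to append a single wrapper element to the new document and attach the imported nodes to that element instead of to the document itself. The following is a minimal sketch that uses only the JDK's built-in parser and XPath engine rather than Saxon. The class name SingleRootCopy, the helper methods, and the wrapper element name "result" are hypothetical names chosen for illustration. The sketch also writes the predicate as @class, on the assumption that the intent was to test the attribute: in XPath a bare class selects a child element named class, so the expression in the question matches all three div elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SingleRootCopy {

    // Copies every node matched by `expression` under one wrapper element,
    // so the new document ends up with exactly one root.
    // `rootName` is a hypothetical wrapper name picked by the caller.
    public static Document copyUnderRoot(Document source, String expression, String rootName)
            throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document target = builder.newDocument();

        // The wrapper is the only node appended directly to the document.
        Element root = target.createElement(rootName);
        target.appendChild(root);

        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xPath.evaluate(expression, source, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); i++) {
            // importNode creates a copy owned by `target`; append it to the wrapper.
            root.appendChild(target.importNode(nodes.item(i), true));
        }
        return target;
    }

    // Parses an XML string so the example does not depend on a file on disk.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<table><div>aa</div><div class=\"sss\">ss</div><div>dd</div></table>");
        Document newDoc = copyUnderRoot(doc, "//div[not(contains(@class, 'sss'))]", "result");
        // Number of div elements copied under the wrapper element.
        System.out.println(newDoc.getDocumentElement().getChildNodes().getLength());
    }
}
```

If a wrapper element is not wanted in the output, another option is to serialize each matched node separately instead of collecting them in one Document.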