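Get the XPath of all nodes in an XML document

Date: 2014-06-26 18:02:26

Tags: java dom xpath

I have an XML file. I want to get/print the (full) XPath of every node in it using Java. I am trying to use the DOM parser.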

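import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

File stocks = new File("File Name");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(stocks);
System.out.println("Parsed successfully");
// getDocumentElement() returns only the root element; its children are not visited here
System.out.println("root of xml file : " + doc.getDocumentElement().getNodeName());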

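I can get my root node to print, but not its children.

1 Answer:

Answer 0: (score: 3)

Oddly enough, I just wrote a method that can be used for this. Be aware, though, that it does not fully support namespaces, and that it only works for ELEMENT nodes.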


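For this method to work, the document also needs to be namespace-aware: call dbFactory.setNamespaceAware(true); before creating the DocumentBuilder. If you cannot make it namespace-aware, replace getLocalName() with getTagName().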
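The reason the namespace-aware flag matters can be shown with a small sketch. With the JDK's built-in parser, an element parsed without namespace awareness reports null from getLocalName(), which would make the method below fail. The class name NamespaceAwareDemo, the helper rootLocalName, and the inline sample XML are all made up for illustration; they are not part of the original answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not from the original answer.
class NamespaceAwareDemo {

    /** Parses the XML string and returns getLocalName() of its root element. */
    static String rootLocalName(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Must be set before the DocumentBuilder is created
            factory.setNamespaceAware(namespaceAware);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getLocalName();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample document with a prefixed root element (illustrative only)
        String xml = "<s:stocks xmlns:s='urn:example'><s:stock/></s:stocks>";
        System.out.println("namespace-aware:     " + rootLocalName(xml, true));
        System.out.println("not namespace-aware: " + rootLocalName(xml, false));
    }
}
```

In the non-namespace-aware case getTagName() still returns the qualified name (here s:stocks), which is why the answer suggests it as the fallback.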


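import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

try {
    XPath xpath = XPathFactory.newInstance().newXPath();
    // "//*" selects every element in the document, including the root
    NodeList nList = (NodeList) xpath.evaluate("//*", doc.getDocumentElement(), XPathConstants.NODESET);

    for (int i = 0; i < nList.getLength(); i++) {
        if (nList.item(i).getNodeType() == Node.ELEMENT_NODE) {
            // Paths are relative to the root element; pass null as the second
            // argument to get paths that start at the document root instead
            System.out.println(getElementXPath((Element) nList.item(i), doc.getDocumentElement()));
        }
    }
} catch (XPathExpressionException e) {
    e.printStackTrace();
}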


/**
 * Builds the XPath of elt relative to the given ancestor.
 * @param elt the element whose path is built
 * @param relativeTo should be an ancestor of elt; if it is not (or is null), the path from the document root is returned
 * @return the indexed path, e.g. /parent[1]/child[2]
 */
public static String getElementXPath(Element elt, Element relativeTo) {
    String path = "";

    do {
        // Prepend this element's step, then move up one level
        String xname = elt.getLocalName() + "[" + getElementIndex(elt) + "]";
        path = "/" + xname + path;

        if (elt.getParentNode() != null && elt.getParentNode().getNodeType() == Node.ELEMENT_NODE)
            elt = (Element) elt.getParentNode();
        else
            elt = null;
    } while (elt != null && !elt.equals(relativeTo));

    return path;
}

/**
 * @param original the element to index
 * @return the 1-based position of this element among its siblings, counting only siblings with the same local name and namespace. Used for XPath indexing
 */
private static int getElementIndex(Element original) {
    int count = 1;

    // Walk backwards over the preceding siblings and count the matching elements
    for (Node node = original.getPreviousSibling(); node != null; node = node.getPreviousSibling()) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            // Null-safe namespace comparison; an element without a namespace has a null URI
            if (element.getLocalName().equals(original.getLocalName())
                    && java.util.Objects.equals(element.getNamespaceURI(), original.getNamespaceURI())) {
                count++;
            }
        }
    }

    return count;
}
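For readers who want something they can run as-is, here is a minimal self-contained sketch of the same idea. It is not the answer's code: instead of evaluating "//*" and walking up from each element, it walks down the DOM tree recursively and builds each path from its parent's path, so every path starts at the document root. The class name XPathLister, its method names, and the inline sample XML are assumptions made for illustration. Like the answer, it uses local names only, so the printed paths are not directly evaluable for documents that use namespaces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class for illustration; a recursive variant of the answer's approach.
class XPathLister {

    /** Returns one indexed path per element, in document order. */
    static List<String> listElementPaths(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so getLocalName() is non-null
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> paths = new ArrayList<>();
            collect(doc.getDocumentElement(), "", paths);
            return paths;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Adds the path of elt, then recurses into its child elements. */
    private static void collect(Element elt, String parentPath, List<String> out) {
        String path = parentPath + "/" + elt.getLocalName() + "[" + indexAmongSiblings(elt) + "]";
        out.add(path);
        for (Node child = elt.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, path, out);
            }
        }
    }

    /** 1-based position among preceding siblings with the same local name and namespace. */
    private static int indexAmongSiblings(Element elt) {
        int index = 1;
        for (Node prev = elt.getPreviousSibling(); prev != null; prev = prev.getPreviousSibling()) {
            if (prev.getNodeType() == Node.ELEMENT_NODE
                    && elt.getLocalName().equals(prev.getLocalName())
                    && Objects.equals(elt.getNamespaceURI(), prev.getNamespaceURI())) {
                index++;
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Illustrative sample document
        String xml = "<stocks><stock><name>A</name></stock><stock><name>B</name></stock></stocks>";
        listElementPaths(xml).forEach(System.out::println);
    }
}
```

For the sample document this prints /stocks[1], /stocks[1]/stock[1], /stocks[1]/stock[1]/name[1], /stocks[1]/stock[2] and /stocks[1]/stock[2]/name[1], one per line. Building each path from the parent's path avoids re-walking the ancestors for every element, at the cost of recursion depth equal to the depth of the document.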