Question

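I want to count all the leaf elements in an XML file using Java. Suppose my XML has a structure like the sample below, and I want to count all the name and id elements in the file. How can I do that?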

Sample XML:

<set>
 <employee>
    <name> </name>
    <id></id>
 </employee> 
 <employee>
     <name> </name>
     <id></id>
  </employee>
</set>

The Java code I tried:

try {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document document = builder.parse(file.toFile());
    Element root = document.getDocumentElement();
    if (!root.hasChildNodes()) {
        paths.add(file);
    } else {
        System.out.println("Element Name in: "+file.getFileName());
        System.out.println("Root element: " + "Total count: " + root.getChildNodes().getLength());
        for (int i = 0; i < root.getChildNodes().getLength(); i++) {
            Node node = root.getChildNodes().item(i);
            if (node.getChildNodes().getLength() != 0) {
                System.out.println("name: " + node.getNodeName() + " size:"+ node.getChildNodes().getLength());
            }
        }
    }
} catch (ParserConfigurationException | SAXException e) {
    e.printStackTrace();
}

Answer 1

XPath is the best way to do this. You can use a double slash in an XPath expression to search at all levels:

XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//name|//id", document, 
    XPathConstants.NODESET);
int count = nodes.getLength();

Update

Since the question is now about counting leaf elements regardless of element name, the XPath expression should be:

XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//*[not(*)]", document, 
    XPathConstants.NODESET);
int count = nodes.getLength();
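
For anyone who wants to try the leaf expression end to end, here is a minimal self-contained sketch. The class name LeafCountXPath and the helper countLeafElements are made up for illustration, and the XML is parsed from a string rather than from a file as in the question. The predicate not(*) keeps only elements that have no child elements, so whitespace text between tags does not affect the result.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical wrapper class, not part of the original answer.
class LeafCountXPath {

    // Counts elements that have no child elements, whatever their names are.
    static int countLeafElements(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("//*[not(*)]", document,
                XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Same structure as the sample XML in the question.
        String xml = "<set>"
                + "<employee><name> </name><id></id></employee>"
                + "<employee><name> </name><id></id></employee>"
                + "</set>";
        System.out.println(countLeafElements(xml));
    }
}
```

For the sample document this counts the two name and two id elements, since set and employee both contain child elements.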

Answer 2

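Note: This answer is about counting elements with specific, known names (name and id). The question has since been changed to ask about counting leaf elements, which this answer does not cover.

To perform a full depth-first search of an XML document, you have a choice of approaches.

If you only need to perform the search and nothing else, a StAX parser is the best choice for performance and memory footprint.

Otherwise, a DOM parser is probably your best choice.

If you don't want to walk the XML tree yourself, you can have XPath do it for you.

Here are examples of all three, with test code: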

private static int countUsingStAX(String xml) throws XMLStreamException {
    int count = 0;
    XMLInputFactory factory = XMLInputFactory.newFactory();
    XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
    while (reader.hasNext()) {
        int event = reader.next();
        if (event == XMLStreamConstants.START_ELEMENT) {
            String name = reader.getLocalName();
            if (name.equals("name") || name.equals("id"))
                count++;
        }
    }
    reader.close();
    return count;
}

private static int countUsingDOM(String xml) throws Exception {
    int count = 0;
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder domBuilder = domFactory.newDocumentBuilder();
    Document document = domBuilder.parse(new InputSource(new StringReader(xml)));
    Node node = document.getDocumentElement();
    while (node != null) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            String name = node.getNodeName();
            if (name.equals("name") || name.equals("id"))
                count++;
        }
        if (node.getFirstChild() != null)
            node = node.getFirstChild();
        else {
            while (node != null && node.getNextSibling() == null)
                node = node.getParentNode();
            if (node != null)
                node = node.getNextSibling();
        }
    }
    return count;
}

private static int countUsingXPath(String xml) throws XPathException {
    String xpathExpr = "//*[self::name or self::id]";
    XPathFactory factory = XPathFactory.newInstance();
    XPath xPath = factory.newXPath();
    NodeList nodeList = (NodeList)xPath.evaluate(xpathExpr,
                                                 new InputSource(new StringReader(xml)),
                                                 XPathConstants.NODESET);
    return nodeList.getLength();
}

public static void main(String[] args) throws Exception {
    String xml = "<set>\r\n" +
                 " <employee>\r\n" +
                 "    <name> </name>\r\n" +
                 "    <id></id>\r\n" +
                 " </employee>\r\n" +
                 " <employee>\r\n" +
                 "     <name> </name>\r\n" +
                 "     <id></id>\r\n" +
                 "  </employee>\r\n" +
                 "</set>";
    System.out.println(countUsingStAX(xml));
    System.out.println(countUsingDOM(xml));
    System.out.println(countUsingXPath(xml));
}

All three print the number 4.

The DOM traversal can also be done using recursion, e.g. with getChildNodes().
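
A recursive version could look roughly like the sketch below. The class RecursiveDomCount and its methods are hypothetical names chosen for this example. countNamed does the same job as countUsingDOM above, while countLeaves is an adaptation for the updated question: it treats an element as a leaf when it has no child elements, and it ignores text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class for illustration only.
class RecursiveDomCount {

    // Counts the "name" and "id" elements in the subtree rooted at node.
    static int countNamed(Node node) {
        int count = 0;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            String name = node.getNodeName();
            if (name.equals("name") || name.equals("id"))
                count++;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++)
            count += countNamed(children.item(i));
        return count;
    }

    // Counts leaf elements below (and including) the given element.
    // Expects an element node, e.g. document.getDocumentElement().
    static int countLeaves(Node element) {
        boolean hasElementChild = false;
        int count = 0;
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
                count += countLeaves(child);
            }
        }
        // An element without element children is itself a leaf.
        return hasElementChild ? count : 1;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<set>"
                + "<employee><name> </name><id></id></employee>"
                + "<employee><name> </name><id></id></employee>"
                + "</set>");
        System.out.println(countNamed(document));
        System.out.println(countLeaves(document.getDocumentElement()));
    }
}
```

Recursion keeps the code short, but very deeply nested documents could exhaust the call stack, which is one reason to prefer the iterative loop shown earlier.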

Answer 3

Have you looked at this article: Mkyong

The gist of it is:

String filepath = "c:\\file.xml";
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(filepath);

NodeList list = doc.getElementsByTagName("employee");

Then get your count:

System.out.println("Total of elements : " + list.getLength());
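
Note that the snippet above counts employee elements. To count the name and id elements from the question, the same call can be made once per tag name and the results added up. The sketch below assumes the XML is available as a string; TagNameCount and its methods are hypothetical names for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class for illustration only.
class TagNameCount {

    // Sums the number of "name" and "id" elements anywhere in the document.
    static int countNameAndId(Document doc) {
        return doc.getElementsByTagName("name").getLength()
                + doc.getElementsByTagName("id").getLength();
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<set>"
                + "<employee><name> </name><id></id></employee>"
                + "<employee><name> </name><id></id></employee>"
                + "</set>");
        System.out.println("Total of elements : " + countNameAndId(doc));
    }
}
```

This only works when the tag names are known in advance. It does not identify leaf elements in general.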
