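I want to count all the leaf elements in an XML file using Java. Suppose my XML is structured like the sample below; I want to count all the name and id elements in the file. How can I do that?
Sample XML: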
<set>
  <employee>
    <name> </name>
    <id></id>
  </employee>
  <employee>
    <name> </name>
    <id></id>
  </employee>
</set>
Java code I tried:
try {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document document = builder.parse(file.toFile());
    Element root = document.getDocumentElement();
    if (!root.hasChildNodes()) {
        paths.add(file);
    } else {
        System.out.println("Element Name in: " + file.getFileName());
        System.out.println("Root element: " + "Total count: " + root.getChildNodes().getLength());
        for (int i = 0; i < root.getChildNodes().getLength(); i++) {
            Node node = root.getChildNodes().item(i);
            if (node.getChildNodes().getLength() != 0) {
                System.out.println("name: " + node.getNodeName() + " size:" + node.getChildNodes().getLength());
            }
        }
    }
} catch (ParserConfigurationException | SAXException e) {
    e.printStackTrace();
}
Answer 0 (score: 3)
XPath is the best way to do this. Two slashes in an XPath expression search at every level of the document:
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//name|//id", document,
XPathConstants.NODESET);
int count = nodes.getLength();
Update
Since the question now asks how to count leaf elements regardless of element name, the XPath expression should be:
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//*[not(*)]", document,
XPathConstants.NODESET);
int count = nodes.getLength();
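As a self-contained sketch of the leaf-counting expression above, the following parses the sample XML from a string and counts the elements that have no child elements. The class name LeafCountXPath and the method countLeaves are illustrative names, not from the question, and the sketch assumes XPath's count() function with XPathConstants.NUMBER as an alternative to fetching the whole node set:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class LeafCountXPath {
    // Hypothetical helper: counts elements that have no child elements.
    static int countLeaves(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // count(...) is evaluated by the XPath engine and comes back as a Double.
        Double count = (Double) xpath.evaluate("count(//*[not(*)])", document,
                XPathConstants.NUMBER);
        return count.intValue();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<set><employee><name> </name><id></id></employee>"
                   + "<employee><name> </name><id></id></employee></set>";
        System.out.println(countLeaves(xml));
    }
}
```

Note that not(*) only tests for child elements, so an element containing nothing but text or whitespace still counts as a leaf.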
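Answer 1 (score: 2)
Note: this answer covers counting elements with specific, known names (name and id). The question has since been changed to ask for a count of leaf elements, which this answer does not cover.
To perform a full depth-first search of an XML document, you have a choice of approaches.
If you only need to search and do nothing else, a StAX parser is the best choice for performance and memory footprint.
Otherwise, a DOM parser is probably your best option.
If you don't want to walk the XML tree yourself, you can have XPath do it for you.
Here are examples of all three, with test code: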
private static int countUsingStAX(String xml) throws XMLStreamException {
    int count = 0;
    XMLInputFactory factory = XMLInputFactory.newFactory();
    XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
    while (reader.hasNext()) {
        int event = reader.next();
        if (event == XMLStreamConstants.START_ELEMENT) {
            String name = reader.getLocalName();
            if (name.equals("name") || name.equals("id"))
                count++;
        }
    }
    reader.close();
    return count;
}

private static int countUsingDOM(String xml) throws Exception {
    int count = 0;
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder domBuilder = domFactory.newDocumentBuilder();
    Document document = domBuilder.parse(new InputSource(new StringReader(xml)));
    Node node = document.getDocumentElement();
    while (node != null) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            String name = node.getNodeName();
            if (name.equals("name") || name.equals("id"))
                count++;
        }
        if (node.getFirstChild() != null)
            node = node.getFirstChild();
        else {
            while (node != null && node.getNextSibling() == null)
                node = node.getParentNode();
            if (node != null)
                node = node.getNextSibling();
        }
    }
    return count;
}

private static int countUsingXPath(String xml) throws XPathException {
    String xpathExpr = "//*[self::name or self::id]";
    XPathFactory factory = XPathFactory.newInstance();
    XPath xPath = factory.newXPath();
    NodeList nodeList = (NodeList) xPath.evaluate(xpathExpr,
                                                  new InputSource(new StringReader(xml)),
                                                  XPathConstants.NODESET);
    return nodeList.getLength();
}

public static void main(String[] args) throws Exception {
    String xml = "<set>\r\n" +
                 " <employee>\r\n" +
                 " <name> </name>\r\n" +
                 " <id></id>\r\n" +
                 " </employee>\r\n" +
                 " <employee>\r\n" +
                 " <name> </name>\r\n" +
                 " <id></id>\r\n" +
                 " </employee>\r\n" +
                 "</set>";
    System.out.println(countUsingStAX(xml));
    System.out.println(countUsingDOM(xml));
    System.out.println(countUsingXPath(xml));
}
All three print the number 4.
The DOM traversal can also be done recursively, for example using getChildNodes().
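A minimal sketch of that recursive variant, assuming the same name/id matching as the iterative DOM example. The class name RecursiveDomCount and the methods countRecursive and countUsingRecursion are illustrative names, not part of the original answer:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RecursiveDomCount {
    // Counts this node (if it is a <name> or <id> element) plus all matches below it.
    static int countRecursive(Node node) {
        int count = 0;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            String name = node.getNodeName();
            if (name.equals("name") || name.equals("id"))
                count++;
        }
        // getChildNodes() also returns text nodes; they fail the element check above.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++)
            count += countRecursive(children.item(i));
        return count;
    }

    // Hypothetical wrapper: parses the XML string and starts at the root element.
    static int countUsingRecursion(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return countRecursive(document.getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        String xml = "<set><employee><name> </name><id></id></employee>"
                   + "<employee><name> </name><id></id></employee></set>";
        System.out.println(countUsingRecursion(xml));
    }
}
```

Recursion is shorter to read than the sibling/parent walk, at the cost of one stack frame per nesting level, which only matters for very deeply nested documents.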
Answer 2 (score: 0)
Have you seen this article on Mkyong?
The gist of it is:
String filepath = "c:\\file.xml";
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(filepath);
NodeList list = doc.getElementsByTagName("employee");
Then get your count:
System.out.println("Total of elements : " + list.getLength());
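The snippet above counts employee elements. To count the name and id elements the question asks about, the same getElementsByTagName call could be made once per tag name and the results summed. This is a sketch under that assumption; the class name TagNameCount and the method countByTagNames are hypothetical, and the XML is read from a string rather than a file path to keep the example self-contained:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class TagNameCount {
    // Hypothetical helper: sums the getElementsByTagName matches for each given tag name.
    static int countByTagNames(String xml, String... tagNames) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        int total = 0;
        for (String tagName : tagNames)
            total += doc.getElementsByTagName(tagName).getLength();
        return total;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<set><employee><name> </name><id></id></employee>"
                   + "<employee><name> </name><id></id></employee></set>";
        System.out.println("Total of elements : " + countByTagNames(xml, "name", "id"));
    }
}
```

This only works when the element names are known in advance, so it does not address counting leaf elements in general.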