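From Googling, I learned that using XPath to extract data from XML makes more sense than looping over the DOM.
At the moment I have a working solution that uses DOM, but the code is verbose and feels untidy and hard to maintain, so I would like to switch to a cleaner XPath solution.
Suppose I have this structure: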
    <products>
        <product>
            <title>Some title 1</title>
            <image>Some image 1</image>
        </product>
        <product>
            <title>Some title 2</title>
            <image>Some image 2</image>
        </product>
        ...
    </products>
I want to be able to run a for loop over each <product> element and, inside that loop, extract the title and image node values.
My code looks like this:
    InputStream is = conn.getInputStream();
    DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(is);
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    XPathExpression expr = xpath.compile("/products/product");
    Object result = expr.evaluate(doc, XPathConstants.NODESET);
    NodeList products = (NodeList) result;
    for (int i = 0; i < products.getLength(); i++) {
        Node n = products.item(i);
        if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
            Element product = (Element) n;
            // do some DOM navigation to get the title and image
        }
    }
In my for loop, I get each <product> as a Node, which I cast to an Element.
Can I simply use my XPath instance to compile another XPathExpression and run it on the Node or the Element?
Answer 0 (score: 6)
Yes, you can always do it like this:
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    XPathExpression expr = xpath.compile("/products/product");
    Object result = expr.evaluate(doc, XPathConstants.NODESET);
    // The new XPath expression to find 'title' within 'product'.
    expr = xpath.compile("title");
    NodeList products = (NodeList) result;
    for (int i = 0; i < products.getLength(); i++) {
        Node n = products.item(i);
        if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
            Element product = (Element) n;
            // Find the 'title' in the 'product'.
            NodeList nodes = (NodeList) expr.evaluate(product, XPathConstants.NODESET);
            // And here is the title.
            System.out.println("TITLE: " + nodes.item(0).getTextContent());
        }
    }
Here I have shown an example that extracts the 'title' value. You can get 'image' in the same way.
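For completeness, here is a minimal self-contained sketch that pulls both values in one pass. It rests on a few assumptions that are not part of the question: the XML is parsed from an in-memory string rather than from `conn.getInputStream()`, and the class name `ProductXPathDemo` and the helper `extractProducts` are made up for illustration. It asks for `XPathConstants.STRING` instead of a node set, so a missing child yields an empty string rather than a null node. Note that the inner expressions are relative (`title`, not `/products/product/title`); an absolute path would be resolved from the document root regardless of the context node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ProductXPathDemo {

    // Hypothetical helper: returns one {title, image} pair per <product>.
    static List<String[]> extractProducts(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Compile each expression once, outside the loop.
        XPathExpression productExpr = xpath.compile("/products/product");
        XPathExpression titleExpr = xpath.compile("title"); // relative to a product
        XPathExpression imageExpr = xpath.compile("image"); // relative to a product

        NodeList products =
            (NodeList) productExpr.evaluate(doc, XPathConstants.NODESET);
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < products.getLength(); i++) {
            // No cast to Element is needed: evaluate() accepts any context Node.
            Node product = products.item(i);
            String title = (String) titleExpr.evaluate(product, XPathConstants.STRING);
            String image = (String) imageExpr.evaluate(product, XPathConstants.STRING);
            rows.add(new String[] { title, image });
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<products>"
            + "<product><title>Some title 1</title><image>Some image 1</image></product>"
            + "<product><title>Some title 2</title><image>Some image 2</image></product>"
            + "</products>";
        for (String[] row : extractProducts(xml)) {
            System.out.println("TITLE: " + row[0] + ", IMAGE: " + row[1]);
        }
    }
}
```

In the original code, the same idea should work by passing `doc` built from the connection's input stream in place of the string-based document used here.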
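Answer 1 (score: 4)
I'm not a big fan of this approach, because you have to build a Document (which can be expensive) before you can apply XPath to it.
I have found VTD-XML to be more efficient when applying XPath to documents, because it does not need to build the whole document as an object tree in memory. Here is some sample code: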
    final VTDGen vg = new VTDGen();
    vg.parseFile("file.xml", false);
    final VTDNav vn = vg.getNav();
    final AutoPilot ap = new AutoPilot(vn);
    ap.selectXPath("/products/product");
    while (ap.evalXPath() != -1) {
        System.out.println("PRODUCT:");
        // you could either apply another xpath or simply get the first child
        if (vn.toElement(VTDNav.FIRST_CHILD, "title")) {
            int val = vn.getText();
            if (val != -1) {
                System.out.println("Title: " + vn.toNormalizedString(val));
            }
            vn.toElement(VTDNav.PARENT);
        }
        if (vn.toElement(VTDNav.FIRST_CHILD, "image")) {
            int val = vn.getText();
            if (val != -1) {
                System.out.println("Image: " + vn.toNormalizedString(val));
            }
            vn.toElement(VTDNav.PARENT);
        }
    }
See also the article "Faster XPaths with VTD-XML".