XPath: selecting a list of nodes in Java

Date: 2015-07-18 15:53:07

Tags: java xml xpath

I have the following XML file:

<RecordSet>
  <Record>
    <ID>001</ID>
    <TermList>
      <Term>Term1</Term>
      <Term>Term2</Term>
      <Term>Term3</Term>
    </TermList>
  </Record>
  <Record>
    <ID>002</ID>
    <TermList>
      <Term>Term3</Term>
      <Term>Term4</Term>
      <Term>Term5</Term>
    </TermList>
  </Record>
</RecordSet>

and I need to parse it into an "ID-Term" file, i.e.

001 Term1
001 Term2
001 Term3
002 Term3
002 Term4
002 Term5

Currently I have the following application:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import javax.xml.parsers.*;
import javax.xml.xpath.*;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class MedlineParser {

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder;
        Document doc = null;
        try {
            builder = factory.newDocumentBuilder();
            doc = builder.parse("/home/andrej/Documents/test.xml");
            // Create XPathFactory object
            XPathFactory xpathFactory = XPathFactory.newInstance();
            // Create XPath object
            XPath xpath = xpathFactory.newXPath();
            try {
                XPathExpression expr1 = xpath.compile("/RecordSet/Record/ID/text()");
                NodeList nodes1 = (NodeList) expr1.evaluate(doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes1.getLength(); i++) {
                    String id = nodes1.item(i).getNodeValue();
                    XPathExpression expr2 = xpath.compile("/RecordSet/Record/TermList/Term/text()");
                    NodeList nodes2 = (NodeList) expr2.evaluate(doc, XPathConstants.NODESET);
                    for (int j = 0; j < nodes2.getLength(); j++) {
                        System.out.println(id + " " + nodes2.item(i).getNodeValue());
                    }
                }
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }

        } catch (IOException | ParserConfigurationException | SAXException e) {
            e.printStackTrace();
        }
    }
}

Unfortunately, the program output is currently:

001 Term1
001 Term1
001 Term1
001 Term1
001 Term1
001 Term1
002 Term2
002 Term2
002 Term2
002 Term2
002 Term2
002 Term2

Any idea what is wrong with the XPath expressions?

2 answers:

Answer 0 (score: 1)

Two problems:

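  1. The XPath must take into account the index of the Record node being iterated in the first loop. Your current XPath fetches all Term nodes every time, for every ID node. You should change it to:

    XPathExpression expr2 = xpath.compile("/RecordSet/Record[" + (i + 1) + "]/TermList/Term/text()");

  2. There is a typo in the inner for loop. You should use j instead of i:

    for (int j = 0; j < nodes2.getLength(); j++) {
        System.out.println(id + " " + nodes2.item(j).getNodeValue());
    }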
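
Putting both fixes together, a minimal sketch might look like the following. The class name IndexedTermLookup, the helper idTermPairs, and the inline SAMPLE_XML string are hypothetical names introduced here so the example runs without the file from the question. Note that XPath positions are 1-based, hence the (i + 1), and that pairing the i-th ID with the i-th Record assumes every Record contains exactly one ID.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class IndexedTermLookup {

    // Inline copy of the XML from the question (hypothetical constant, used instead of a file).
    static final String SAMPLE_XML =
        "<RecordSet>"
        + "<Record><ID>001</ID><TermList>"
        + "<Term>Term1</Term><Term>Term2</Term><Term>Term3</Term>"
        + "</TermList></Record>"
        + "<Record><ID>002</ID><TermList>"
        + "<Term>Term3</Term><Term>Term4</Term><Term>Term5</Term>"
        + "</TermList></Record>"
        + "</RecordSet>";

    // Returns one "ID Term" string per Term element, in document order.
    static List<String> idTermPairs(String xml) {
        List<String> pairs = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            NodeList ids = (NodeList) xpath.evaluate(
                    "/RecordSet/Record/ID/text()", doc, XPathConstants.NODESET);
            for (int i = 0; i < ids.getLength(); i++) {
                String id = ids.item(i).getNodeValue();
                // Fix 1: restrict the Term lookup to the (i + 1)-th Record (XPath is 1-based).
                String termPath = "/RecordSet/Record[" + (i + 1) + "]/TermList/Term/text()";
                NodeList terms = (NodeList) xpath.evaluate(
                        termPath, doc, XPathConstants.NODESET);
                for (int j = 0; j < terms.getLength(); j++) {
                    // Fix 2: index the inner node list with j, not i.
                    pairs.add(id + " " + terms.item(j).getNodeValue());
                }
            }
        } catch (Exception e) {
            // Wrap parser and XPath failures so callers need no checked exceptions.
            throw new IllegalStateException("Could not extract ID-Term pairs", e);
        }
        return pairs;
    }

    public static void main(String[] args) {
        idTermPairs(SAMPLE_XML).forEach(System.out::println);
    }
}
```

Run against the sample XML, this should print the six "ID Term" lines the question asks for.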

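Answer 1 (score: 1)

It seems you are printing the Cartesian product of all IDs and terms.

This would be easier:

  1. Select and loop over all Record nodes using the XPath expression /RecordSet/Record.
  2. For each Record node, select the ID (using the XPath ID) and the terms (using the XPath TermList/Term), with the Record node as the context node.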
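
The two steps above can be sketched roughly as follows. The class name RecordContextLookup, the helper idTermPairs, and the inline SAMPLE_XML string are hypothetical, introduced only to keep the example self-contained. The key point is that XPath.evaluate accepts any node as its context item, so the relative paths ID and TermList/Term are resolved against the current Record rather than the whole document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class RecordContextLookup {

    // Inline copy of the XML from the question (hypothetical constant, used instead of a file).
    static final String SAMPLE_XML =
        "<RecordSet>"
        + "<Record><ID>001</ID><TermList>"
        + "<Term>Term1</Term><Term>Term2</Term><Term>Term3</Term>"
        + "</TermList></Record>"
        + "<Record><ID>002</ID><TermList>"
        + "<Term>Term3</Term><Term>Term4</Term><Term>Term5</Term>"
        + "</TermList></Record>"
        + "</RecordSet>";

    // Returns one "ID Term" string per Term element, grouped by Record.
    static List<String> idTermPairs(String xml) {
        List<String> pairs = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Step 1: select every Record node once.
            NodeList records = (NodeList) xpath.evaluate(
                    "/RecordSet/Record", doc, XPathConstants.NODESET);
            for (int i = 0; i < records.getLength(); i++) {
                Node record = records.item(i);
                // Step 2: relative paths are evaluated with the Record as context node.
                String id = xpath.evaluate("ID", record);
                NodeList terms = (NodeList) xpath.evaluate(
                        "TermList/Term", record, XPathConstants.NODESET);
                for (int j = 0; j < terms.getLength(); j++) {
                    pairs.add(id + " " + terms.item(j).getTextContent());
                }
            }
        } catch (Exception e) {
            // Wrap parser and XPath failures so callers need no checked exceptions.
            throw new IllegalStateException("Could not extract ID-Term pairs", e);
        }
        return pairs;
    }

    public static void main(String[] args) {
        idTermPairs(SAMPLE_XML).forEach(System.out::println);
    }
}
```

Compared with building an indexed path per ID, this version does not rely on the i-th ID belonging to the i-th Record, so it should stay correct even if some Record has no ID or no TermList.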