Java: JAXP parses unexpected values (parsing XML into List<List<String>>)

Date: 2014-06-17 12:06:52

Tags: java xml jaxp

I have an XML file like this:

<?xml version="1.0" encoding="ISO-8859-2"?>
<some some1="string" some2="string">
<value1>string</value1>
<value2>string</value2>
<position1>
  <someval1>string</someval1>
  <someval2>string</someval2>
  <someval3>string</someval3>
  <someval4>string</someval4>
</position1>
<position2>
  <someval1>string</someval1>
  <someval2>string</someval2>
  <someval3>string</someval3>
  <someval4>string</someval4>
</position2>
</some>

I wrote the following code:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(Vars.LOCAL_PATH + fileName);
XPath xPath =  XPathFactory.newInstance().newXPath();
Element root = doc.getDocumentElement();
NodeList nl = root.getChildNodes();
ArrayList<String> tempData = new ArrayList<String>();

for (int i=0; i < nl.getLength() ; i++) {
    Node n = nl.item(i);
    if (n.getNodeType() == Node.ELEMENT_NODE) {
        NodeList current = n.getChildNodes();
        for (int j = 0; j < current.getLength(); j++) {
            tempData.add(current.item(j).getTextContent().trim());
            System.out.println(current.item(j).getTextContent().trim() + " - str to note every output line");
        }
        xmlData.add(tempData);
        tempData.clear();
    }
}

But the result is:

000/F/ZZZ/2001 - str to note every output line
2001-01-01 - str to note every output line
 - str to note every output line
USD - str to note every output line
 - str to note every output line
1 - str to note every output line
 - str to note every output line
EUR - str to note every output line
 - str to note every output line

Why are there empty lines? What is wrong with my code? Also, System.out.println(current.getLength()) gives me 9, but why 9? There should be 4. Thanks.

1 Answer:

Answer 0: (score: 0)

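In the second for loop you iterate over every child node without checking whether it is an element node. You get 9 nodes because you are counting 4 element nodes plus 5 text nodes. The text nodes hold the whitespace (tabs, spaces, and newlines) that sits before and after each <someval> element.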
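The count of nine can be checked directly. The following is a minimal sketch, not the asker's code: the class name ChildNodeCount, the helper countChildren, and the inline XML string are assumptions made for illustration. It parses a fragment shaped like <position1> and tallies the direct children by node type.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ChildNodeCount {
    // Hypothetical fragment with the same layout as <position1> in the question.
    // The newlines and indentation between elements become text nodes.
    static final String XML =
        "<position1>\n"
        + "  <someval1>string</someval1>\n"
        + "  <someval2>string</someval2>\n"
        + "  <someval3>string</someval3>\n"
        + "  <someval4>string</someval4>\n"
        + "</position1>";

    /** Returns {elementCount, textCount} for the direct children of the root element. */
    static int[] countChildren(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            NodeList children = root.getChildNodes();
            int elements = 0;
            int texts = 0;
            for (int i = 0; i < children.getLength(); i++) {
                short type = children.item(i).getNodeType();
                if (type == Node.ELEMENT_NODE) {
                    elements++;
                } else if (type == Node.TEXT_NODE) {
                    texts++;
                }
            }
            return new int[] {elements, texts};
        } catch (Exception e) {
            // Wrapped only to keep the sketch short; real code would handle parse errors.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = countChildren(XML);
        // prints "elements=4, text=5, total=9"
        System.out.println("elements=" + counts[0] + ", text=" + counts[1]
            + ", total=" + (counts[0] + counts[1]));
    }
}
```

Calling getTextContent().trim() on one of the five whitespace-only text nodes yields an empty string, which is where the blank output lines come from.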

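If you want to keep only the element nodes, you need to test the type of the current node inside that loop, just as you did in the outer loop: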

for (int j = 0; j < current.getLength(); j++) {
    if (current.item(j).getNodeType() == Node.ELEMENT_NODE) { // add this!
        tempData.add(current.item(j).getTextContent().trim());
        System.out.println(current.item(j).getTextContent().trim() + " - str to note every output line");
    }
}

Now it will no longer print empty lines, and the body of the if will run four times for each <position> element.
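Putting the filter into a complete program might look like the sketch below. It is written under assumptions, not taken from the question: the class name PositionRows, the method parseRows, and the SAMPLE string are hypothetical, and the result type List<List<String>> is inferred from the question title. Two choices go beyond the answer and are labelled in the comments. First, each row gets a fresh list, because adding the same tempData instance to xmlData and then calling clear() on it leaves every stored entry pointing at one emptied list. Second, child elements that contain only text, such as <value1>, produce no row here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class PositionRows {
    // Hypothetical sample, shortened from the structure shown in the question.
    static final String SAMPLE =
        "<some>\n"
        + "<value1>a</value1>\n"
        + "<position1>\n"
        + "  <someval1>x</someval1>\n"
        + "  <someval2>y</someval2>\n"
        + "</position1>\n"
        + "<position2>\n"
        + "  <someval1>z</someval1>\n"
        + "  <someval2>w</someval2>\n"
        + "</position2>\n"
        + "</some>";

    /** Builds one row per child element of the root, holding the text of its element children. */
    static List<List<String>> parseRows(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            List<List<String>> rows = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text nodes between elements
                }
                // Assumption: a new list per row, so stored rows are not shared or cleared later.
                List<String> row = new ArrayList<>();
                NodeList grandchildren = child.getChildNodes();
                for (int j = 0; j < grandchildren.getLength(); j++) {
                    Node grandchild = grandchildren.item(j);
                    if (grandchild.getNodeType() == Node.ELEMENT_NODE) {
                        row.add(grandchild.getTextContent().trim());
                    }
                }
                // Assumption: elements with no element children (e.g. <value1>) are left out.
                if (!row.isEmpty()) {
                    rows.add(row);
                }
            }
            return rows;
        } catch (Exception e) {
            // Wrapped only to keep the sketch short; real code would handle parse errors.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseRows(SAMPLE)); // prints [[x, y], [z, w]]
    }
}
```

If the text of <value1> and <value2> is also needed, it would have to be read from those elements directly with getTextContent(), since the element filter skips their text-node children.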