How to make XPath return an empty string in Java when a node or child node does not exist in the XML

Asked: 2015-03-11 04:52:05

Tags: java xml xpath

I have an XML file, "sample.xml", that contains 4 records.

<?xml version='1.0' encoding='UTF-8'?>
<hello xmlns:show="http://www.example.com" xmlns:css="http://www.example.com" xml_version="2.0">


  <entry id="2008-0001">
    <show:id>2008-0001</show:id>
    <show:published-datetime>2008-01-15T15:00:00.000-05:00</show:published-datetime>
    <show:last-modified-datetime>2012-03-19T00:00:00.000-04:00</show:last-modified-datetime>
    <show:css>
      <css:metrics>
        <css:score>3.6</css:score>
        <css:access-vector>LOCAL</css:access-vector>
        <css:authentication>NONE</css:authentication>
        <css:generated-on-datetime>2008-01-15T15:22:00.000-05:00</css:generated-on-datetime>
      </css:metrics>
    </show:css>
    <show:summary>This is first entry.</show:summary>
  </entry>
  <entry id="2008-0002">
    <show:id>2008-0002</show:id>
    <show:published-datetime>2008-02-11T20:00:00.000-05:00</show:published-datetime>
    <show:last-modified-datetime>2014-03-15T23:22:37.303-04:00</show:last-modified-datetime>
    <show:css>
      <css:metrics>
        <css:score>5.8</css:score>
        <css:access-vector>NETWORK</css:access-vector>
        <css:authentication>NONE</css:authentication>
        <css:generated-on-datetime>2008-02-12T10:12:00.000-05:00</css:generated-on-datetime>
      </css:metrics>
    </show:css>
    <show:summary>This is second entry.</show:summary>
  </entry>

  <entry id="2008-0003">
    <show:id>2008-0003</show:id>
    <show:published-datetime>2009-03-26T06:12:08.780-04:00</show:published-datetime>
    <show:last-modified-datetime>2009-03-26T06:12:09.313-04:00</show:last-modified-datetime>
    <show:summary>This is 3rd entry with missing "css" tag and their metrics.</show:summary>
  </entry>

  <entry id="2008-0004">
    <show:id>CVE-2008-0004</show:id>
    <show:published-datetime>2008-01-11T19:46:00.000-05:00</show:published-datetime>
    <show:last-modified-datetime>2011-09-06T22:41:45.753-04:00</show:last-modified-datetime>
    <show:css>
      <css:metrics>
        <css:score>4.3</css:score>
        <css:access-vector>NETWORK</css:access-vector>
        <css:authentication>NONE</css:authentication>
        <css:generated-on-datetime>2008-01-14T09:37:00.000-05:00</css:generated-on-datetime>
      </css:metrics>
    </show:css>
    <show:summary>This is 4th entry.</show:summary>
  </entry>
</hello>

And one Java file, "Test.java":

    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

public class Test {

    public static void main(String[] args) {



        List<String> list = new ArrayList<String>();


        File fXmlFile = new File("/home/ankit/sample.xml");

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        try
        {
            DocumentBuilder dBuilder = factory.newDocumentBuilder();

            Document doc = dBuilder.parse(fXmlFile);

            doc.getDocumentElement().normalize();

            NodeList nList = doc.getElementsByTagName("entry");

            XPathFactory xPathfactory = XPathFactory.newInstance();

            XPath xpath = xPathfactory.newXPath();


            for (int i = 0; i < nList.getLength(); i++)
            {

                XPathExpression expr1 = xpath.compile("//hello/entry/css/metrics/score");

                NodeList nodeList1 = (NodeList) expr1.evaluate(doc, XPathConstants.NODESET);

                if(nodeList1.item(i)!=null)
                {
                    Node currentItem = nodeList1.item(i);

                    if(!currentItem.getTextContent().isEmpty())
                    {
                        list.add(currentItem.getTextContent());
                        }
                }
            }
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }

        System.out.println("size----"+list.size());
        for(int i=0;i<list.size();i++)
        {
            System.out.println("list----"+list.get(i));
        }
    }
}

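I need to read the <entry> tags from the XML, and I am using XPath to do it. The XML file has 4 <entry> tags, and each of them contains a <show:css> tag, except the 3rd <entry>, where <show:css> is missing. I collect the css score values into a list. When I run this Java code, the first 2 values are stored in the list correctly, but the 3rd position holds the css score of the 4th entry.

I want a list whose 1st, 2nd and 4th elements are "3.6", "5.8" and "4.3", and whose 3rd element is an empty string or null. Instead I get only 3 elements in the list, the values for entries 1, 2 and 4.

I need an empty string "" at the 3rd position and the original value at the 4th position. In other words, if the tag does not exist, a blank value should be added to the list.

Current output: ["3.6", "5.8", "4.3"]

Expected output: ["3.6", "5.8", "", "4.3"]

Can anyone help me solve this?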

3 Answers:

Answer 0: (score: 1)

There are probably several ways to achieve this...

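My basic idea is to select every entry node, whether or not it has a css/metrics/score child (you could simply select all entry nodes, but this demonstrates the power of the query language).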

Something like...

XPathExpression expr1 = xPath.compile("//hello/entry[css/metrics/score or not(css/metrics/score)]");

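I know the conditional expression makes little sense as written; I wanted the OP to see that extra conditions can be added to extend it to other requirements. Thanks to everyone for pointing this out even though I had already mentioned it... I hope we can all move on.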
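To make the remark about predicates concrete, here is a minimal sketch that counts the nodes selected by a few variations of the expression. The class name `EntryPredicates`, the helper `count`, and the simplified, namespace-free `SAMPLE` document are assumptions made for illustration; they are not part of the original question or answer.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EntryPredicates {

    // Hypothetical, namespace-free stand-in for sample.xml: entry "b" has no score.
    static final String SAMPLE = "<hello>"
            + "<entry id='a'><css><metrics><score>3.6</score></metrics></css></entry>"
            + "<entry id='b'/>"
            + "<entry id='c'><css><metrics><score>4.3</score></metrics></css></entry>"
            + "</hello>";

    // Returns how many nodes the given XPath 1.0 expression selects in the document.
    static int count(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "A or not(A)" is always true, so this selects every entry.
        System.out.println(count(SAMPLE, "//hello/entry[css/metrics/score or not(css/metrics/score)]"));
        // Only the entries that have a score.
        System.out.println(count(SAMPLE, "//hello/entry[css/metrics/score]"));
        // Only the entries that lack a score.
        System.out.println(count(SAMPLE, "//hello/entry[not(css/metrics/score)]"));
    }
}
```

Since the combined predicate is always true, it should select the same nodes as plain `//hello/entry`; either half on its own narrows the selection.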

Then loop over the resulting NodeList and query each entry Node for css/metrics/score. If the result is null, add a null value (or whatever else you want) to the list, for example...

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document doc = dbf.newDocumentBuilder().parse(JavaApplication908.class.getResourceAsStream("/Hello.xml"));

XPathFactory xf = XPathFactory.newInstance();
XPath xPath = xf.newXPath();

XPathExpression expr1 = xPath.compile("//hello/entry[css/metrics/score or not(css/metrics/score)]");
XPathExpression expr2 = xPath.compile("css/metrics/score");

List<String> values = new ArrayList<>(25);

NodeList nodeList1 = (NodeList) expr1.evaluate(doc, XPathConstants.NODESET);
for (int index = 0; index < nodeList1.getLength(); index++) {
    Node node = nodeList1.item(index);
    System.out.println(node.getAttributes().getNamedItem("id"));

    Node css = (Node) expr2.evaluate(node, XPathConstants.NODE);
    if (css != null) {
        values.add(css.getTextContent());
    } else {
        values.add(null);
    }

}

for (String value : values) {
    System.out.println(value);
}

This outputs...

id="2008-0001"
id="2008-0002"
id="2008-0003"
id="2008-0004"
3.6
5.8
null
4.3

(The first four lines are the entry nodes' id attributes; the last four lines are the resulting css/metrics/score values.)
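If an empty string is wanted instead of null, one possible variation is to evaluate the relative expression with `XPathConstants.STRING`: in XPath 1.0 the string value of an empty node-set is the empty string, so no null check is needed. The sketch below is self-contained and makes several assumptions that are not in the answer above: it parses a trimmed copy of the sample from a string, turns on namespace-aware parsing, and registers the `show` and `css` prefixes through a small `NamespaceContext`. The names `ScorePerEntry`, `scores` and `SAMPLE` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ScorePerEntry {

    static final String NS = "http://www.example.com";

    // Trimmed copy of sample.xml: the third entry has no <show:css> element.
    static final String SAMPLE =
            "<hello xmlns:show='" + NS + "' xmlns:css='" + NS + "'>"
            + "<entry id='2008-0001'><show:css><css:metrics><css:score>3.6</css:score></css:metrics></show:css></entry>"
            + "<entry id='2008-0002'><show:css><css:metrics><css:score>5.8</css:score></css:metrics></show:css></entry>"
            + "<entry id='2008-0003'><show:summary>no css here</show:summary></entry>"
            + "<entry id='2008-0004'><show:css><css:metrics><css:score>4.3</css:score></css:metrics></show:css></entry>"
            + "</hello>";

    // Returns one value per <entry>; entries without a score contribute "".
    static List<String> scores(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // assumption: the original code uses the default factory
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Minimal prefix mapping; both prefixes share one URI in the sample.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    boolean known = "show".equals(prefix) || "css".equals(prefix);
                    return known ? NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return null; // not needed for evaluating expressions here
                }
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            XPathExpression entries = xpath.compile("/hello/entry");
            XPathExpression score = xpath.compile("show:css/css:metrics/css:score");

            NodeList nodes = (NodeList) entries.evaluate(doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // Evaluated relative to this entry; a missing node yields "".
                result.add((String) score.evaluate(nodes.item(i), XPathConstants.STRING));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(scores(SAMPLE));
    }
}
```

The key point is the same as in the answer: iterate over the entries and look up the score relative to each one, so the list position always matches the entry position.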

Answer 1: (score: 0)

I'm not an XPath expert, but looking at the code, I think you are just missing a few lines:

if(nodeList1.item(i)!=null)
{
   Node currentItem = nodeList1.item(i);
   if(!currentItem.getTextContent().isEmpty())
   {
     list.add(currentItem.getTextContent());
   }
   else
     list.add("");
}
else
 list.add("");

Answer 2: (score: 0)

  

@MathiasMüller can you tell me how to do this in a single expression in XPath 2.0? - ankit

The equivalent XPath 2.0 expression would be

for $x in //entry return (if ($x//*:score) then $x//*:score else '')

which makes heavy use of constructs newly introduced in XPath 2.0. The output will be

3.6
5.8
[Empty string]
4.3

Note, however, that most XPath implementations currently support only 1.0. You can try this XPath 2.0 expression in an XSLT stylesheet online here, on a site that uses Saxon 9.5 EE.
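For readers limited to the XPath 1.0 engine in the JDK's `javax.xml.xpath` package, a rough approximation of the expression above is to move the `for` loop into Java and replace the `*:score` wildcard with a `local-name()` test. This is only a sketch under assumptions: the names `LocalNameScores`, `scores` and `SAMPLE` are hypothetical, the document is a trimmed copy of the sample, and `string(...)` keeps only the first matching score per entry, which is enough here because each entry has at most one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocalNameScores {

    // Trimmed copy of sample.xml: the third entry has no score element.
    static final String SAMPLE =
            "<hello xmlns:show='http://www.example.com' xmlns:css='http://www.example.com'>"
            + "<entry><show:css><css:metrics><css:score>3.6</css:score></css:metrics></show:css></entry>"
            + "<entry><show:css><css:metrics><css:score>5.8</css:score></css:metrics></show:css></entry>"
            + "<entry><show:summary>no css here</show:summary></entry>"
            + "<entry><show:css><css:metrics><css:score>4.3</css:score></css:metrics></show:css></entry>"
            + "</hello>";

    // One string per <entry>: the first descendant named "score" in any
    // namespace, or "" when the entry has none.
    static List<String> scores(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList entries = (NodeList) xpath.evaluate("//entry", doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < entries.getLength(); i++) {
                // The two-argument evaluate(...) returns the result as a String.
                result.add(xpath.evaluate("string(.//*[local-name()='score'])", entries.item(i)));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String s : scores(SAMPLE)) {
            System.out.println(s.isEmpty() ? "[Empty string]" : s);
        }
    }
}
```

Matching on `local-name()` avoids registering namespace prefixes, at the cost of ignoring which namespace the element actually belongs to.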