XML parser returns null elements

Asked: 2011-01-01 21:43:37

Tags: java xml xml-parsing

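When I try to parse an XML file, the parser sometimes returns a null element for the title. I think it has to do with the HTML entity &#039; (an apostrophe) in the title text.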

How can I solve this?

I have the following XML file:

<item>
<title>&#039; Nieuwe DVD &#039;</title>
<description>tekst, tekst tekst</description>
<link>dvd.html</link>
<category>nieuws</category>
<pubDate>Sat, 1 Jan 2011 9:24:00 +0000</pubDate>
</item>

The following code parses the XML file:

//DocumentBuilderFactory, DocumentBuilder are used for 
      //xml parsing
      DocumentBuilderFactory dbf = DocumentBuilderFactory
        .newInstance();
      DocumentBuilder db = dbf.newDocumentBuilder();

      //parse the xml input stream with db (DocumentBuilder) and
      //take the root Element of the resulting Document
      Document document = db.parse(is);
      Element element = document.getDocumentElement();

      //merge adjacent text nodes, then collect the rss <item> nodes in a NodeList
      element.normalize();

      NodeList nodeList = element.getElementsByTagName("item");

      if (nodeList.getLength() > 0) 
      {
       for (int i = 0; i < nodeList.getLength(); i++) 
       {
        //take each entry (corresponds to an <item></item> element in the
        //xml data)

        Element entry = (Element) nodeList.item(i);
        entry.normalize();
        Element _titleE = (Element) entry.getElementsByTagName(
          "title").item(0);

        Element _categoryE = (Element) entry
          .getElementsByTagName("category").item(0);
        Element _pubDateE = (Element) entry
          .getElementsByTagName("pubDate").item(0);
        Element _linkE = (Element) entry.getElementsByTagName(
          "link").item(0);

        String _title = _titleE.getFirstChild().getNodeValue();
        String _category = _categoryE.getFirstChild().getNodeValue();
        Date _pubDate = new Date(_pubDateE.getFirstChild().getNodeValue());
        String _link = _linkE.getFirstChild().getNodeValue();

        //create RssItemObject and add it to the ArrayList
        RssItem rssItem = new RssItem(_title, _category, _pubDate, _link);

        rssItems.add(rssItem);
       }

       //release the connection once, after all items have been read
       conn.disconnect();
      }

1 answer:

Answer 0 (score: 0)

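Don't use getFirstChild().getNodeValue() when what you really want is getTextContent(). getFirstChild() returns only the first child node, which is null for an empty element, and some parsers may split an element's text into several child nodes around character references such as &#039;, so the first child is not guaranteed to hold the whole text.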
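
The advice above can be sketched roughly as follows. This is a minimal, self-contained example rather than the asker's actual code: the class name ItemTextDemo and the helpers textOf and parseRoot are hypothetical names introduced for illustration, and the XML is a shortened copy of the item from the question, with an empty category added to show the edge case.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class ItemTextDemo {

    // Hypothetical helper: returns the full text of the first <tag> element
    // below parent, or null if no such element exists.
    static String textOf(Element parent, String tag) {
        Node node = parent.getElementsByTagName(tag).item(0);
        // getTextContent() concatenates all descendant text nodes, so it does
        // not matter whether the parser produced one text node or several.
        return node == null ? null : node.getTextContent();
    }

    // Hypothetical helper: parses an XML string and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<item><title>&#039; Nieuwe DVD &#039;</title>"
                   + "<link>dvd.html</link><category></category></item>";
        Element item = parseRoot(xml);

        System.out.println(textOf(item, "title"));              // ' Nieuwe DVD '
        System.out.println(textOf(item, "category").isEmpty()); // true: empty element, no exception
        System.out.println(textOf(item, "pubDate"));            // null: element is missing
    }
}
```

With a helper like this, the loop in the question could read each field as textOf(entry, "title") and check for null before building the RssItem, instead of chaining getFirstChild().getNodeValue().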