Self-closing tags when parsing XML with Java

Time: 2014-07-23 06:37:00

Tags: java xml xml-parsing

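I am trying to parse an XML file to get news from an RSS feed, but self-closing (empty) tags are not parsed correctly. Some items have no description tag.

My XML looks like this: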

<rss version="2.0">
  <channel>
    <title>Tata Group</title>
    <link>http://www.tata.com</link>
    <description>
    Tata is a rapidly growing business group based in India with significant international operations. The business operations of the Tata Group currently encompass seven business sectors: communications and information technology, engineering, materials, services, energy, consumer products and chemicals.
    </description>
    <copyright>Copyright (C) 2014 Tata Sons Ltd</copyright>
    <item>
      <title>Tata Power commissions waste water recovery plant at power house #6, Jamshedpur
      </title>
      <link>
        http://www.tata.com/rssread.aspx?artid=tiEeXsbwZ54=
      </link>
      <description>Jamshedpur: Tata Power, India's largest integrated power company, has adopted several innovative technological solutions to improve the plant processes at its generation faciliti...
      </description>
      <pubDate>22 Jul 2014 12:00:00 GMT</pubDate>
    </item>
    </channel>
    </rss>

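What I currently get: I can see all items that have a non-empty description tag, but every item with an empty, self-closing description tag is skipped.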

try {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        XmlPullParser xpp = factory.newPullParser();
        FileReader xmlReader = new FileReader(destination);
        xpp.setInput(xmlReader);
        int eventType = xpp.getEventType();
        String NodeValue;
        while (eventType != XmlPullParser.END_DOCUMENT) {
            switch (eventType) {
            case XmlPullParser.START_DOCUMENT:
                break;
            case XmlPullParser.START_TAG:
                NodeValue = xpp.getName();// Start of a Node
                if (NodeValue.equalsIgnoreCase("item")) {
                    flagItem = true;
                } else if (NodeValue.equalsIgnoreCase("title") && flagItem) {
                    eventType = xpp.next();
                    if (eventType == XmlPullParser.TEXT) {
                        message.setTitle(xpp.getText());
                    }
                } else if (NodeValue.equalsIgnoreCase("description/") && flagItem) {
                    message.setDescription("Description not available..");
                    Log.out(logFlag, logTag, "Reaching the critical point...........  self closing tag reached!!!");
                    flagItem = false;
                    list.add(message);
                    message = null;
                    message = new Message();

                } else if (NodeValue.equalsIgnoreCase("description") && flagItem) {
                    eventType = xpp.next();
                    if (eventType == XmlPullParser.TEXT) {
                        message.setDescription(xpp.getText());
                        flagItem = false;
                        list.add(message);
                        message = null;
                        message = new Message();
                    }
                }
                break;
            }
            eventType = xpp.next();
            Log.out(logFlag, logTag, "xml file downloaded : "+list);

1 answer:

Answer 0 (score: 0)

Add

case XmlPullParser.END_TAG:
    // ...
    break;

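In the general case you may have to keep track of the "open" elements. But since you are only interested in <description>, you can use a flag, setting it when you see its START_TAG and clearing it when you see its END_TAG. An empty element such as <description/> is reported as a START_TAG immediately followed by an END_TAG.

This is certainly incorrect, because '/' is never part of the local name: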
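The flag approach could be sketched roughly as follows. This is only a sketch under stated assumptions: it uses the JDK's StAX XMLStreamReader rather than XmlPullParser so that it runs without extra libraries (START_ELEMENT and END_ELEMENT play the roles of START_TAG and END_TAG), and the class name RssDescriptionSketch, the method parseItems and the "title | description" output format are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not the asker's code: collects one line per <item>.
final class RssDescriptionSketch {

    // Fallback text, mirroring the message used in the question.
    static final String NO_DESCRIPTION = "Description not available..";

    static List<String> parseItems(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            boolean inItem = false;
            String current = null;              // element whose text is being collected
            StringBuilder text = new StringBuilder();
            String title = null;
            String description = null;

            while (reader.hasNext()) {
                switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT: {
                    String name = reader.getLocalName();
                    if (name.equals("item")) {
                        inItem = true;
                        title = null;
                        description = null;
                    } else if (inItem && (name.equals("title") || name.equals("description"))) {
                        current = name;         // set the flag
                        text.setLength(0);
                    }
                    break;
                }
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    if (current != null) {
                        text.append(reader.getText());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT: {
                    String name = reader.getLocalName();
                    if (name.equals(current)) {
                        // Also reached for <description/>: the text is simply empty.
                        String value = text.toString().trim();
                        if (current.equals("title")) {
                            title = value;
                        } else {
                            description = value;
                        }
                        current = null;         // clear the flag
                    } else if (name.equals("item")) {
                        // Emit the item here, so a missing or empty description cannot skip it.
                        boolean missing = description == null || description.isEmpty();
                        result.add(title + " | " + (missing ? NO_DESCRIPTION : description));
                        inItem = false;
                    }
                    break;
                }
                default:
                    break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        return result;
    }
}
```

The design choice worth noting is that the item is added to the list when the end of <item> is seen, not when a description text node happens to appear, so items with <description/> or with no description element at all are still kept.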

NodeValue.equalsIgnoreCase("description/")

Omit the '/' from the string.

Later I also noticed the equalsIgnoreCase method call. You do know that XML is case-sensitive w.r.t. element names?!
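To illustrate the case-sensitivity point, here is a small sketch, again assuming the JDK's StAX parser rather than XmlPullParser; the class CaseSensitivitySketch and the method isWellFormed are hypothetical names. A start tag and an end tag that differ only in case do not match, so the document is rejected.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only.
final class CaseSensitivitySketch {

    // Returns true if the parser reads the whole document without a well-formedness error.
    static boolean isWellFormed(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            // For example, <Description> closed by </description>.
            return false;
        }
    }
}
```

Since element names are compared exactly by the parser, comparing them with equals rather than equalsIgnoreCase keeps the application code consistent with what the XML actually says.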