XMLStreamException: ParseError at [row,col]:[5,3] Message: The element type "meta" must be terminated by the matching end-tag "</meta>"

Time: 2014-05-06 08:18:42

Tags: java parsing

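I am trying to display the text of the news items in any category of a newspaper, using the newspaper's RSS feed. When I run the code, I sometimes get the exception in the title, and sometimes it works fine.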

The RSS page I use is:

RSSFeedParser parser = new RSSFeedParser("http://www.cumhuriyet.com.tr/rss/5");
Feed feed = parser.readFeed();

The RSS parser code:

package main;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.XMLEvent;


public class RSSFeedParser {
  static final String TITLE = "title";
  static final String DESCRIPTION = "description";
  static final String CHANNEL = "channel";
  static final String LANGUAGE = "language";
  static final String COPYRIGHT = "copyright";
  static final String LINK = "link";
  static final String AUTHOR = "author";
  static final String ITEM = "item";
  static final String PUB_DATE = "pubDate";
  static final String GUID = "guid";
  static final String IMG = "img";

  final URL url;

  public RSSFeedParser(String feedUrl) {
    try {
      this.url = new URL(feedUrl);
    } catch (MalformedURLException e) {
      throw new RuntimeException(e);
    }
  }

  public Feed readFeed() {
    Feed feed = null;
    try {
      boolean isFeedHeader = true;
      // Set header values initially to the empty string
      String description = "";
      String title = "";
      String link = "";
      String language = "";
      String copyright = "";
      String author = "";
      String pubDate = "";
      String guid = "";
      String img = "";
      // First create a new XMLInputFactory
      XMLInputFactory inputFactory = XMLInputFactory.newInstance();
      // Setup a new eventReader
      InputStream in = read();
      XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
      // read the XML document
      while (eventReader.hasNext()) {
        XMLEvent event = eventReader.nextEvent();
        if (event.isStartElement()) {
          String localPart = event.asStartElement().getName()
              .getLocalPart();
          switch (localPart) {
          case ITEM:
            if (isFeedHeader) {
              isFeedHeader = false;
              feed = new Feed(title, link, description, language,
                  copyright, pubDate);
            }
            event = eventReader.nextEvent();
            break;
          case TITLE:
            title = getCharacterData(event, eventReader);
            break;
          case DESCRIPTION:
            description = getCharacterData(event, eventReader);
            break;
          case LINK:
            link = getCharacterData(event, eventReader);
            break;
          case GUID:
            guid = getCharacterData(event, eventReader);
            break;
          case LANGUAGE:
            language = getCharacterData(event, eventReader);
            break;
          case AUTHOR:
            author = getCharacterData(event, eventReader);
            break;
          case PUB_DATE:
            pubDate = getCharacterData(event, eventReader);
            break;
          case COPYRIGHT:
            copyright = getCharacterData(event, eventReader);
            break;
          }
        } else if (event.isEndElement()) {
          // Compare string contents with equals(); == only compares references
          if (ITEM.equals(event.asEndElement().getName().getLocalPart())) {
            FeedMessage message = new FeedMessage();

            message.setDescription(description);
            message.setPubDate(pubDate);
            message.setLink(link);
            message.setTitle(title);
            message.setImg(img);
            feed.getMessages().add(message);
            event = eventReader.nextEvent();
            continue;
          }
        }
      }
    } catch (XMLStreamException e) {
      throw new RuntimeException(e);
    }
    return feed;
  }

  private String getCharacterData(XMLEvent event, XMLEventReader eventReader)
      throws XMLStreamException {
    String result = "";
    event = eventReader.nextEvent();
    if (event instanceof Characters) {
      result = event.asCharacters().getData();
    }
    return result;
  }

  private InputStream read() {
    try {
      return url.openStream();
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }
} 

The exception message is:

Exception in thread "main" java.lang.RuntimeException: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[5,3]
Message: The element type "meta" must be terminated by the matching end-tag "</meta>".
    at main.RSSFeedParser.readFeed(RSSFeedParser.java:112)
    at cumhuriyet.Dunya.cumDunya(Dunya.java:32)
    at automation.ServerInteraction.main(ServerInteraction.java:83)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[5,3]
Message: The element type "meta" must be terminated by the matching end-tag "</meta>".
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(Unknown Source)
    at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(Unknown Source)
    at main.RSSFeedParser.readFeed(RSSFeedParser.java:58)
    ... 2 more

1 answer:

Answer 0 (score: 0)

It looks like this is a live document, i.e. one that changes frequently. There is also no sign of a <meta> tag in it.

I can think of two explanations for what is happening:

  
      
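  1. Sometimes the document is being generated or created incorrectly.

  2. Sometimes you are getting an HTML error page instead of the document you are expecting, and the XML parser cannot handle the <meta> tag in the HTML.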
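
One rough way to test the second explanation is to look at the HTTP status and Content-Type before handing the stream to the parser. The sketch below is only a heuristic: the class name FeedResponseCheck and the helper looksLikeXml are made up for illustration, and it assumes the server labels the feed with a Content-Type that mentions XML, which not every server does.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Hypothetical helper, not part of the question's code.
class FeedResponseCheck {

  // Heuristic (an assumption, not a rule): a feed should arrive with status 200
  // and a Content-Type that mentions XML, e.g. application/rss+xml or text/xml.
  static boolean looksLikeXml(int status, String contentType) {
    if (status != HttpURLConnection.HTTP_OK || contentType == null) {
      return false;
    }
    return contentType.toLowerCase(Locale.ROOT).contains("xml");
  }

  public static void main(String[] args) throws IOException {
    if (args.length == 0) {
      System.err.println("usage: FeedResponseCheck <feed-url>");
      return;
    }
    HttpURLConnection conn = (HttpURLConnection) new URL(args[0]).openConnection();
    try {
      int status = conn.getResponseCode();
      String contentType = conn.getContentType();
      // Print what the server actually sent so intermittent failures can be compared.
      System.out.println(status + " " + contentType + " -> "
          + (looksLikeXml(status, contentType) ? "looks like XML" : "probably not the feed"));
    } finally {
      conn.disconnect();
    }
  }
}
```

If a failing request shows a non-200 status or text/html, the error page explanation becomes much more likely; if the headers look fine, the feed itself is probably malformed at that moment.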

To track this down, you are going to have to capture the precise input that is causing the parse to fail.
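
A minimal sketch of capturing that input, assuming the feed is small enough to buffer in memory: read the whole response into a byte array first, parse from the array, and dump the bytes when the parser complains. The class FeedCapture and its methods readAll and parseError are hypothetical names, and decoding the dump as UTF-8 is an assumption about the feed's encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical helper, not part of the question's code.
class FeedCapture {

  // Buffer the whole stream so the exact bytes are still available if parsing fails.
  static byte[] readAll(InputStream in) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    byte[] chunk = new byte[8192];
    int n;
    while ((n = in.read(chunk)) != -1) {
      buffer.write(chunk, 0, n);
    }
    return buffer.toByteArray();
  }

  // Returns null if the bytes are well-formed XML, otherwise the parser's message.
  static String parseError(byte[] body) {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    try {
      XMLEventReader reader = factory.createXMLEventReader(new ByteArrayInputStream(body));
      while (reader.hasNext()) {
        reader.nextEvent(); // walk every event; malformed input throws here
      }
      reader.close();
      return null;
    } catch (XMLStreamException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length == 0) {
      System.err.println("usage: FeedCapture <feed-url>");
      return;
    }
    byte[] body;
    try (InputStream in = new URL(args[0]).openStream()) {
      body = readAll(in);
    }
    String error = parseError(body);
    if (error != null) {
      // Dump the parser message and the input that caused it (assumed UTF-8).
      System.err.println(error);
      System.err.println(new String(body, StandardCharsets.UTF_8));
    }
  }
}
```

Running this against the feed URL from the question until it fails should show whether the captured body is a broken feed or an HTML page. The same buffered bytes could also be passed to the existing readFeed logic instead of the live stream.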