Android: RSS parsing stops at special characters

Time: 2011-11-02 20:32:39

Tags: android parsing rss special-characters

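I have searched a lot but have not found out why my RSS reader stops at special characters such as æ, ø, and å. The reader reads the feed until it hits a special character, then stops reading that element and moves on to the next one. As a result, when I show the news in my app, the text is cut off at the special character, which is very annoying. It certainly has something to do with encoding, but I can't figure out how to handle it in my code.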

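This code works with other feeds, such as http://www.fyens.dk/rss/sport, which is encoded in ISO-8859-1. With that feed, special characters display without any problem. But when I try a UTF-8 feed such as http://ob.dk/forum/rss.aspx?ForumID=3&Mode=0, the problem appears.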

Any suggestions on how to solve this?

    try {
        // Open a URL connection, send a GET request to the server,
        // and read the RSS XML data
        URL url = new URL("http://ob.dk/forum/rss.aspx?ForumID=3&Mode=0");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();

        if (conn.getResponseCode() == HttpURLConnection.HTTP_OK) {
            InputStream is = conn.getInputStream();

            // DocumentBuilderFactory and DocumentBuilder are used for
            // XML parsing
            DocumentBuilderFactory dbf = DocumentBuilderFactory
                    .newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();

            // Use db (the DocumentBuilder) to parse the XML data and
            // take the document's root Element
            Document document = db.parse(is);
            Element element = document.getDocumentElement();

            // Collect the <item> nodes into a NodeList
            NodeList nodeList = element.getElementsByTagName("item");

            if (nodeList.getLength() > 0) {
                for (int i = 0; i < nodeList.getLength(); i++) {

                    // Take each entry (corresponds to an <item></item>
                    // element in the XML data)
                    Element entry = (Element) nodeList.item(i);

                    Element _titleE = (Element) entry.getElementsByTagName(
                            "title").item(0);
                    Element _descriptionE = (Element) entry
                            .getElementsByTagName("description").item(0);
                    Element _pubDateE = (Element) entry
                            .getElementsByTagName("pubDate").item(0);
                    Element _linkE = (Element) entry.getElementsByTagName(
                            "link").item(0);

                    String _title = _titleE.getFirstChild().getNodeValue();
                    String _description = _descriptionE.getFirstChild().getNodeValue();
                    Date _pubDate = new Date(_pubDateE.getFirstChild().getNodeValue());
                    String _link = _linkE.getFirstChild().getNodeValue();

                    // Shift the publication time back by two hours
                    int time = _pubDate.getHours() - 2;
                    _pubDate.setHours(time);

                    RssItem rssItem = new RssItem("OB.dk: " + _title, _description,
                            _pubDate, "http://www.google.com/gwt/x?u=" + _link);

                    rssItems.add(rssItem);
                }
            }

        }
    } catch (Exception e) {
        e.printStackTrace();
    }
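
One plausible explanation, offered as an assumption rather than a confirmed diagnosis: `getFirstChild().getNodeValue()` returns only the first child node of an element, and a DOM parser may represent an element's text as several adjacent nodes. If the Android parser splits the text at entity or character references, the value would be cut off exactly where the special character begins. The sketch below reproduces that effect on a desktop JVM, using a CDATA section to force multiple child nodes, and shows a helper that concatenates all of them. `TextNodeDemo`, `parseRoot`, `firstChildText`, and `fullText` are hypothetical names, not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; none of these names come from the original post.
final class TextNodeDemo {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Mirrors the posted code: reads only the first child node.
    static String firstChildText(Element e) {
        Node first = e.getFirstChild();
        return first == null ? "" : first.getNodeValue();
    }

    // Concatenates every text and CDATA child of the element.
    static String fullText(Element e) {
        StringBuilder sb = new StringBuilder();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            short type = n.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(n.getNodeValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The CDATA section makes the parser produce three child nodes.
        Element title = parseRoot("<title>Bl<![CDATA[\u00e5]]>b\u00e6r</title>");
        System.out.println(firstChildText(title)); // only the text before the CDATA section
        System.out.println(fullText(title));       // the complete title
    }
}
```

In the posted code, this would mean replacing calls such as `_titleE.getFirstChild().getNodeValue()` with something like `fullText(_titleE)`. `Element.getTextContent()` is a standard DOM Level 3 alternative where the platform supports it.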

1 answer:

Answer 0 (score: 1)