XML Parsing Error: junk after document element when fetching an RSS feed

Time: 2013-10-09 03:12:31

Tags: php xml ajax rss

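I am trying to use Ajax to fetch the RSS feed document from rss.news.yahoo.com/rss/topstories, extract the values of its 'title' tags, and echo them to the screen.

xmlget.htm implements the Ajax call as a GET request.

xmlget.php uses the PHP function file_get_contents to load the page at the URL passed to it in the GET variable $_GET['url'], and outputs it so that the 'title' tags can be displayed on screen.

The error I get is: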

XML Parsing Error: junk after document element Location: moz-nullprincipal:{2f186a54-8730-4ead-9bf9-f82c8d56ad8f} Line Number 2, Column 1:

xmlget.htm

<html>
<head>
    <title>Ajax Example</title>
</head>
<body>
    <h1 style="text-align: center;">Loading a web page into a DIV</h1>
    <div id='info'>This sentence will be replaced</div>
<script>
    nocache = "&nocache="+Math.random()*1000000
    url = "rss.news.yahoo.com/rss/topstories"
    request = new ajaxRequest()
    request.open("GET","xmlget.php?url="+url+nocache,true)
    out = "";

    request.onreadystatechange = function(){
        if(this.readyState == 4){
            if(this.status == 200){
                if(this.responseXML != ""){
                    titles = this.responseXML.getElementsByTagName('title')
                    for (j = 0 ; j < titles.length ; ++j)
                        out += titles[j].childNodes[0].nodeValue + '<br />'
                    document.getElementById('info').innerHTML = out
                }
                else alert("Ajax error: No data received")
            }
            else alert( "Ajax error: " + this.statusText)
        }    
    }

    request.send(null)

    function ajaxRequest(){
        try{
            request = new ActiveXObject("Microsoft.XMLHTTP");
        } catch (e1){
            try{
                request = new ActiveXObject("Msxml2.XMLHTTP");
            } catch (e2){
                try{
                    request = new XMLHttpRequest()
                } catch (e3){
                    request = false
                }
            }
        }
        return request
    }
</script>
</body>
</html>

xmlget.php

<?php
if(isset($_GET['url'])){

    function SanitizeString($var) {
        $var = strip_tags($var);
        $var = htmlentities($var);
        return stripcslashes($var);
    }

    header('Content-Type: text/xml');
    echo file_get_contents("http://www.".SanitizeString($_GET['url']));

}

?>

2 Answers:

Answer 0 (score: 0)

Please add the following line to the head section:

<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />

Answer 1 (score: 0)

Found the problem! The file_get_contents function could not find the host because the URL was invalid. ROFL ...

INCORRECT

echo file_get_contents("http://www.".SanitizeString($_GET['url']));

CORRECT

echo file_get_contents("http://".SanitizeString($_GET['url']));
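To illustrate why the extra "www." matters, here is a minimal JavaScript sketch that uses the standard URL class to show which host each version of the string points at. The helper name feedHost is hypothetical and not part of the original code. The feed path is the value of url in xmlget.htm. With the prefix, the request targets a different host name than the feed's. A failed fetch would presumably leave PHP sending a warning message instead of the feed, which a browser expecting text/xml would then reject.

```javascript
// Hypothetical helper (not from the original post): prepend a prefix to the
// scheme-less feed address and report which host the resulting URL targets.
function feedHost(prefix, feedPath) {
    return new URL(prefix + feedPath).hostname;
}

// Same value as the `url` variable in xmlget.htm.
const feedPath = "rss.news.yahoo.com/rss/topstories";

// What the original xmlget.php builds: the feed's host gains an extra "www." label.
const withWww = feedHost("http://www.", feedPath);

// What the corrected xmlget.php builds: the host is the feed's own host.
const withoutWww = feedHost("http://", feedPath);

console.log(withWww);     // www.rss.news.yahoo.com
console.log(withoutWww);  // rss.news.yahoo.com
```

Passing the full address, scheme included, from the page to the script would avoid guessing a prefix on the server at all, although the script would then need to validate what it receives.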