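Tricky RSS feed to read with curl

Time: 2013-08-23 20:02:32

Tags: php curl rss rss-reader

Does anyone know how to read this feed with curl: http://maxhire.net/cp/?EA5E6F361D4364703D044F72 ? I am obviously missing some curl configuration, but I am new to this and usually work in JS.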

function url_get_contents ($Url) {

    if (!function_exists('curl_init')){
        die('CURL is not installed!');
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url);
    curl_setopt($ch, CURLOPT_POST, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $output = curl_exec($ch);
    curl_close($ch);
    return $output;
}

The script is called like this:

echo url_get_contents('http://maxhire.net/cp/?EA5E6F361D4364703D044F72');

It does not work for this feed, but it works with any other, for example http://xml.corriereobjects.it/rss/homepage.xml

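1 answer:

Answer 0 (score: 4)

This site seems to expect a cookie named AspxAutoDetectCookieSupport. If it does not find it, it redirects you to a cookie-detection page, and the request gets stuck in a redirect loop: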

> curl -I -L http://maxhire.net/cp/?EA5E6F361D4364703D044F72
HTTP/1.1 302 Found
Date: Fri, 23 Aug 2013 23:10:55 GMT
Server: Microsoft-IIS/6.0
P3P: CP="CAO PSA OUR"
X-Powered-By: ASP.NET
X-AspNet-Version: 4.0.30319
Location: /cp/?EA5E6F361D4364703D044F72&AspxAutoDetectCookieSupport=1
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 180
Connection: Keep-Alive
Set-Cookie: AspxAutoDetectCookieSupport=1; path=/

HTTP/1.1 302 Found
Date: Fri, 23 Aug 2013 23:10:56 GMT
Server: Microsoft-IIS/6.0
P3P: CP="CAO PSA OUR"
X-Powered-By: ASP.NET
X-AspNet-Version: 4.0.30319
Location: /cp/?EA5E6F361D4364703D044F72&AspxAutoDetectCookieSupport=1&AspxAutoDetectCookieSupport=1
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 214
Connection: Keep-Alive
Set-Cookie: AspxAutoDetectCookieSupport=1; path=/

HTTP/1.1 302 Found
Date: Fri, 23 Aug 2013 23:10:57 GMT
Server: Microsoft-IIS/6.0
P3P: CP="CAO PSA OUR"
X-Powered-By: ASP.NET
X-AspNet-Version: 4.0.30319
Location: /cp/?EA5E6F361D4364703D044F72&AspxAutoDetectCookieSupport=1&AspxAutoDetectCookieSupport=1&AspxAutoDetectCookieSupport=1
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 248
Connection: Keep-Alive
Set-Cookie: AspxAutoDetectCookieSupport=1; path=/


^C
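
The same loop can be observed from PHP. The following is only a minimal sketch and not part of the original answer: it caps the number of redirects with CURLOPT_MAXREDIRS and reports how many were followed before curl gave up. The function name probe_redirects and the limit of 5 are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): follow redirects up to
// a fixed limit and report what happened, so a redirect loop becomes visible.
function probe_redirects($url, $maxRedirects = 5) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);              // headers only, like `curl -I`
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);      // follow redirects, like `curl -L`
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects);  // stop instead of looping forever
    curl_exec($ch);
    $report = array(
        'errno'     => curl_errno($ch),                              // 0 means no error
        'error'     => curl_error($ch),                              // human-readable message
        'redirects' => curl_getinfo($ch, CURLINFO_REDIRECT_COUNT),   // redirects followed
        'last_url'  => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),    // last URL requested
    );
    curl_close($ch);
    return $report;
}
```

If the returned errno equals CURLE_TOO_MANY_REDIRECTS, the server kept redirecting until the limit was reached, which matches the loop shown in the header dump.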

So you need to set this cookie: AspxAutoDetectCookieSupport=1

curl_setopt($ch, CURLOPT_COOKIE, 'AspxAutoDetectCookieSupport=1');
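
Hard-coding the cookie works here because its value is fixed. As an alternative sketch (an assumption, not tested against this site), curl's cookie engine can be switched on so that a cookie set by a redirect response is sent back on the next request. Passing an empty string to CURLOPT_COOKIEFILE enables cookie handling without loading any file. The function name and the redirect limit of 10 are hypothetical.

```php
<?php
// Hypothetical variant (not from the original answer): let curl keep cookies
// in memory across redirects instead of hard-coding the cookie value.
function url_get_contents_with_cookie_engine($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);    // assumed safety limit against loops
    curl_setopt($ch, CURLOPT_COOKIEFILE, '');   // empty string: enable cookie handling, load nothing
    $output = curl_exec($ch);                   // false on failure
    curl_close($ch);
    return $output;
}
```

Whether this is enough depends on how the server performs its cookie check, so the explicit CURLOPT_COOKIE line remains the approach the answer actually verified.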

With the first problem solved, another one appears: if you do not set a value for the user agent, the server sends you this page:

<html xmlns:atom="http://www.w3.org/2005/Atom">
<head><meta http-equiv="Content-Type" content="text/xml; charset=iso-8859-1" /><title>
        Untitled Page
</title><link href="App_Themes/Default/Common.css" type="text/css" rel="stylesheet" /><link href="App_Themes/Default/Container.css" type="text/css" rel="stylesheet" /><link href="App_Themes/Default/Content.css" type="text/css" rel="stylesheet" /><link href="App_Themes/Default/Login.css" type="text/css" rel="stylesheet" /></head>
<body>
    <form name="form1" method="post" action="rssCurrentJobs.aspx?site=5E6F361D4364703D044F72" id="form1">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTc2MTg4NDc4NmRk" />

    <div>

    </div>
    </form>
</body>
</html>

So add a user agent value:

curl_setopt($ch, CURLOPT_USERAGENT, "SomeUserAgent");

Full code:

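function url_get_contents ($Url) {

    // Bail out early if the curl extension is not available.
    if (!function_exists('curl_init')){
        die('CURL is not installed!');
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url);
    curl_setopt($ch, CURLOPT_POST, 0);
    // Return the response body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // Follow the 302 redirect issued by the server.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    // Without a user agent the server returns an empty HTML page.
    curl_setopt($ch, CURLOPT_USERAGENT, "SomeUserAgent");
    // Without this cookie the server redirects in a loop.
    curl_setopt($ch, CURLOPT_COOKIE, 'AspxAutoDetectCookieSupport=1');
    $output = curl_exec($ch);
    curl_close($ch);
    return $output;
}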
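
Once the function returns the feed body, it can be parsed like any other RSS document. The sketch below is an illustration only: rss_item_titles is a hypothetical helper, it relies on PHP's SimpleXML extension, and it assumes an RSS 2.0 layout (channel/item/title), which the original post does not confirm for this feed.

```php
<?php
// Hypothetical helper (not from the original answer): extract item titles
// from an RSS 2.0 string. Returns an empty array for failed downloads,
// unparsable input, or documents without channel/item elements.
function rss_item_titles($xml) {
    if ($xml === false || $xml === '') {
        return array();                       // curl_exec() failed or returned nothing
    }
    $previous = libxml_use_internal_errors(true);   // suppress parser warnings
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);          // restore the previous setting
    if ($feed === false || !isset($feed->channel->item)) {
        return array();                       // not XML, or not the assumed RSS layout
    }
    $titles = array();
    foreach ($feed->channel->item as $item) {
        $titles[] = trim((string) $item->title);
    }
    return $titles;
}
```

It could be called as print_r(rss_item_titles(url_get_contents('http://maxhire.net/cp/?EA5E6F361D4364703D044F72'))); an empty array would suggest that the server returned something other than the feed, such as the blank HTML page shown earlier.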