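I was asked to fetch a particular line from a page, but the site appears to block cURL requests.
The site in question is http://www.habbo.com/home/Intricat
I tried changing the User-Agent to see whether that was what they were blocking, but it made no difference.
The code I am using is as follows: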
<?php
$curl_handle = curl_init();
// Send a browser-like User-Agent
curl_setopt($curl_handle, CURLOPT_USERAGENT, "Mozilla/5.0");
// This is the URL you would like the content grabbed from
curl_setopt($curl_handle, CURLOPT_URL, 'http://www.habbo.com/home/Intricat');
// Seconds to wait for the connection before giving up. Useful if the server you are requesting data from is down, so you can show a "sorry" page instead
curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
// Return the response as a string instead of printing it directly
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
$buffer = curl_exec($curl_handle);
// Release the handle
curl_close($curl_handle);
// Change the message below as you wish; keep it inside the " " quotes.
if (empty($buffer))
{
    print "Sorry, It seems our weather resources are currently unavailable, please check back later.";
}
else
{
    print $buffer;
}
?>
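Any ideas for other ways I can get a line of code from that page if they have blocked cURL requests?
EDIT: When I run curl -i from my server, it looks like the site sets a cookie first?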
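Answer 0 (score: 1)
Open your browser and copy the headers it sends. If the request looks exactly the same, the site cannot tell that you are using cURL. If cookies are used, attach them as headers.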
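As a minimal sketch of that idea, the snippet below sends browser-like headers and an optional Cookie header with cURL. The function names `build_cookie_header` and `fetch_like_browser` are hypothetical, and the User-Agent and Accept values are only examples; the real values should be copied from your own browser's network inspector.

```php
<?php
// Hypothetical helper: turn a name => value map into a Cookie header value.
function build_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . rawurlencode((string) $value);
    }
    return implode('; ', $pairs);
}

// Hypothetical helper: fetch a URL while sending browser-like headers.
// The header values are examples, not ones known to satisfy any given site.
function fetch_like_browser(string $url, array $cookies = [])
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0',
        CURLOPT_HTTPHEADER     => [
            'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language: en-US,en;q=0.5',
        ],
    ]);
    if ($cookies) {
        // CURLOPT_COOKIE sets the contents of the Cookie request header.
        curl_setopt($ch, CURLOPT_COOKIE, build_cookie_header($cookies));
    }
    // Returns the body as a string, or false on failure.
    return curl_exec($ch);
}
```

It could be called as `fetch_like_browser('http://www.habbo.com/home/Intricat', ['name' => 'value'])`, with the cookie names and values taken from the browser session.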
Answer 1 (score: 1)
This is a cut and paste from a cURL class I wrote a few years ago; hopefully you can pick some gems out of it for yourself.
function get_url($url)
{
    curl_setopt ($this->ch, CURLOPT_URL, $url);
    curl_setopt ($this->ch, CURLOPT_USERAGENT, $this->user_agent);
    // read cookies from, and save them back to, the same file
    curl_setopt ($this->ch, CURLOPT_COOKIEFILE, $this->cookie_name);
    curl_setopt ($this->ch, CURLOPT_COOKIEJAR, $this->cookie_name);
    if(!is_null($this->referer))
    {
        curl_setopt ($this->ch, CURLOPT_REFERER, $this->referer);
    }
    curl_setopt ($this->ch, CURLOPT_SSL_VERIFYHOST, 2);
    curl_setopt ($this->ch, CURLOPT_HEADER, 0);
    if($this->follow)
    {
        curl_setopt ($this->ch, CURLOPT_FOLLOWLOCATION, 1);
    }
    else
    {
        curl_setopt ($this->ch, CURLOPT_FOLLOWLOCATION, 0);
    }
    curl_setopt ($this->ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt ($this->ch, CURLOPT_HTTPHEADER, array("Accept: text/html,text/vnd.wap.wml,*.*"));
    curl_setopt ($this->ch, CURLOPT_SSL_VERIFYPEER, FALSE); // disables certificate verification so https works without a CA bundle (insecure)
    $try=0;
    $result="";
    while( ($try<=$this->retry_attempts) && (empty($result)) ) // retry while the result is empty, up to the configured limit
    {
        $try++;
        $result = curl_exec($this->ch);
        $this->response=curl_getinfo($this->ch);
        // $response['http_code'] 4xx is an error
    }
    // set the referring URL to the current URL for the next page.
    if($this->referer_to_last) $this->set_referer($url);
    return $result;
}
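The loop above retries only while the body is empty, and, as written, it can run up to retry_attempts + 1 times. As a rough sketch of a variant that also treats HTTP status codes of 400 and above as failures, the helper below takes any callable that performs one request. The names `fetch_with_retry` and `curl_fetcher` and the array shape they exchange are assumptions made for this illustration and are not part of the class above.

```php
<?php
// Hypothetical helper: call $fetch until it yields a non-empty body with a
// status below 400, or until $maxAttempts calls have been made.
// $fetch must return ['body' => string|false, 'http_code' => int].
function fetch_with_retry(callable $fetch, int $maxAttempts = 3): array
{
    $result = ['body' => false, 'http_code' => 0];
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $fetch();
        if (!empty($result['body']) && $result['http_code'] < 400) {
            break;
        }
    }
    // Number of calls actually made.
    $result['attempts'] = min($attempt, $maxAttempts);
    return $result;
}

// Hypothetical cURL-backed fetcher for use with fetch_with_retry().
function curl_fetcher(string $url): callable
{
    return function () use ($url): array {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $body = curl_exec($ch);
        $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        return ['body' => $body, 'http_code' => $code];
    };
}
```

Usage would look like `$r = fetch_with_retry(curl_fetcher($url), 5);`, after which `$r['body']` and `$r['http_code']` describe the last attempt.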
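Answer 2 (score: 1)
You are not very specific about the kind of block you are talking about. The site in question, http://www.habbo.com/home/Intricat,
first checks whether the browser has JavaScript enabled: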
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta http-equiv="Content-Script-Type" content="text/javascript">
<script type="text/javascript">function setCookie(c_name, value, expiredays) {
var exdate = new Date();
exdate.setDate(exdate.getDate() + expiredays);
document.cookie = c_name + "=" + escape(value) + ((expiredays == null) ? "" : ";expires=" + exdate.toGMTString()) + ";path=/";
}
function getHostUri() {
var loc = document.location;
return loc.toString();
}
setCookie('YPF8827340282Jdskjhfiw_928937459182JAX666', '179.222.19.192', 10);
setCookie('DOAReferrer', document.referrer, 10);
location.href = getHostUri();</script>
</head>
<body>
<noscript>This site requires JavaScript and Cookies to be enabled. Please change your browser settings or upgrade your
browser.
</noscript>
</body>
</html>
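Since curl has no JavaScript support, you need to use an HTTP client that does, or you need to mimic that script and create the cookies and the new request to the same URI yourself.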
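The "mimic the script" route might look like the sketch below: request the page once, read the quoted arguments of the setCookie(...) calls out of the returned HTML, then request the same URL again with those cookies. The function names are hypothetical, the regular expression only handles string-literal arguments like the ones shown above, and rawurlencode() is only an approximation of the JavaScript escape() call. Whether the site accepts the second request is an assumption, not something verified here.

```php
<?php
// Hypothetical helper: collect setCookie('name', 'value', days) calls whose
// name and value are both quoted string literals.
function extract_js_cookies(string $html): array
{
    $cookies = [];
    $pattern = "/setCookie\\(\\s*'([^']+)'\\s*,\\s*'([^']*)'\\s*,/";
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $cookies[$m[1]] = $m[2];
        }
    }
    return $cookies;
}

// Hypothetical helper: fetch once, replay the script's cookies, fetch again.
function fetch_with_js_cookies(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_USERAGENT      => 'Mozilla/5.0',
    ]);
    $first = curl_exec($ch);
    if ($first === false) {
        return false;
    }
    $cookies = extract_js_cookies($first);
    if (!$cookies) {
        return $first; // no cookie-setting script found, return the page as is
    }
    // The script also stores document.referrer, which is empty on a direct visit.
    $cookies['DOAReferrer'] = '';
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . rawurlencode($value);
    }
    curl_setopt($ch, CURLOPT_COOKIE, implode('; ', $pairs));
    // The script reloads the same location, so the same URL is requested again.
    return curl_exec($ch);
}
```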
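Answer 3 (score: 0)
I know this is a very old post, but since I had to answer the same question today, I am sharing it here for anyone who comes across it; it may be useful to them. I am also fully aware that the OP asked specifically about curl, but, as in my case, someone may be interested in a solution whether or not it uses curl.
I wanted to fetch a page with curl, but the page was blocking it. If the block is not because of JavaScript but because of the user agent (which was my case, and setting the agent in curl did not help), then wget may be a solution: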
wget
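Answer 4 (score: 0)
You can use 'wget' through the shell to access this content:
function wget($url){
    // get content with wget, since some sites do not allow curl or file_get_contents requests
    // escapeshellarg() keeps a crafted URL from injecting shell commands
    $content = shell_exec('wget -O - ' . escapeshellarg($url));
    return $content;
}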
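A variant that also sends a browser-like user agent, as a minimal sketch: the helper names `build_wget_command` and `wget_fetch` and the default user-agent string are assumptions for illustration. The flags used are standard wget options (-q for quiet, -O - to write to standard output, --user-agent to set the header).

```php
<?php
// Hypothetical helper: build a wget command line with both arguments escaped.
function build_wget_command(string $url, string $userAgent): string
{
    return 'wget -q -O - --user-agent=' . escapeshellarg($userAgent)
         . ' ' . escapeshellarg($url);
}

// Hypothetical helper: run wget and return the page body.
// shell_exec() returns null or false when nothing could be read.
function wget_fetch(string $url, string $userAgent = 'Mozilla/5.0')
{
    return shell_exec(build_wget_command($url, $userAgent));
}
```

This requires wget to be installed and shell_exec() to be enabled on the server.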