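I am trying to fetch the content of a page on a website and show the result on a temporary page.
The page I want to fetch is: http://www.tournamentsoftware.com/sport/draw.aspx?id=600CA297-99CA-4420-AE1A-698BA10C39B0&draw=1
I want to retrieve the content of this page and then extract the specific table containing the fixtures.
The script I am using returns a 404 Not Found error, even though the URL exists.
My script: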
function nxs_cURLTest($url, $msg, $testText){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    // The SSL options below only matter for https:// URLs
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSLVERSION,3); // Apparently 2 or 3
    $response = curl_exec($ch);
    $errmsg = curl_error($ch);
    $cInfo = curl_getinfo($ch);
    curl_close($ch);
    echo "Testing ... ".$url." - ".$cInfo['url']."<br />";
    // The test passes when the expected text appears in the response body
    if (stripos($response, $testText)!==false)
        echo "....".$msg." - OK<br />";
    else
    {
        echo "....<b style='color:red;'>".$msg." - Problem</b><br /><pre>";
        print_r($errmsg);
        print_r($cInfo);
        print_r(htmlentities($response));
        echo "</pre>There is a problem with cURL. You need to contact your server admin or hosting provider.";
    }
}
nxs_cURLTest("http://www.tournamentsoftware.com/sport/draw.aspx?id=600CA297-99CA-4420-AE1A-698BA10C39B0&draw=1", "HTTPS to Toernooi.nl", 'link rel="canonical" href="http://www.tournamentsoftware.com/sport/draw.aspx?id=600CA297-99CA-4420-AE1A-698BA10C39B0&draw=1"');
Can anyone help me with this?
Answer 0 (score: 0)
I ran the following on XAMPP on Windows 7, viewing the result in Chrome, and it returned the data:
$url = 'http://www.tournamentsoftware.com/sport/draw.aspx?id=600CA297-99CA-4420-AE1A-698BA10C39B0&draw=1';
echo curl_scrap($url);

function curl_scrap($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
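It does not return exactly the same source as opening the URL in Chrome, because the function does not set things such as the user-agent field, but it does return the data you want. This was just a quick and dirty test.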
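If it helps, here is a rough sketch of how the two steps from the question (fetching the page, then pulling out a table) could be combined. It is not a confirmed fix for the 404. The helper names `fetch_page` and `extract_tables` are made up for illustration, and it assumes the `curl` and `dom` extensions are enabled. It reports the HTTP status instead of silently returning an error page, and it returns every `<table>` it finds, so you would still need to pick the one that holds the fixtures.

```php
<?php
// Sketch only: fetch_page() and extract_tables() are hypothetical helper names.

/**
 * Fetch a URL with cURL and return the response body.
 * Throws on a transport error or on a non-2xx HTTP status.
 */
function fetch_page(string $url, int $timeout = 10): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, if any
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);

    $body   = curl_exec($ch);
    $error  = curl_error($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false) {
        throw new RuntimeException("cURL error: $error");
    }
    if ($status < 200 || $status >= 300) {
        throw new RuntimeException("Unexpected HTTP status: $status");
    }
    return $body;
}

/**
 * Return every <table> in the HTML as a list of rows,
 * each row being a list of trimmed cell texts.
 * Rows of nested tables are included in their outer table as well.
 */
function extract_tables(string $html): array
{
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath  = new DOMXPath($doc);
    $tables = [];
    foreach ($xpath->query('//table') as $table) {
        $rows = [];
        foreach ($xpath->query('.//tr', $table) as $tr) {
            $cells = [];
            foreach ($xpath->query('./th|./td', $tr) as $cell) {
                $cells[] = trim($cell->textContent);
            }
            $rows[] = $cells;
        }
        $tables[] = $rows;
    }
    return $tables;
}
```

You would call it as `$tables = extract_tables(fetch_page($url));` and then inspect `$tables` with `print_r` to see which index holds the table you are after. If `fetch_page` throws with status 404, the request itself is being rejected, and comparing the request headers with what the browser sends would be the next thing to check.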