Question

我试图获取关于某个网站的网站内容。我想在临时页面上显示结果。

我想要获取此页面： http://www.tournamentsoftware.com/sport/draw.aspx?id=600CA297-99CA-4420-AE1A-698BA10C39B0&draw=1

我想返回此页面的内容，然后使用灯具获取特定的表格。

我使用的脚本在网址存在的情况下返回404 Not found错误。

我的剧本：

function nxs_cURLTest($url, $msg, $testText){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSLVERSION,3); // Apparently 2 or 3
$response = curl_exec($ch);
$errmsg = curl_error($ch);
$cInfo = curl_getinfo($ch);
curl_close($ch);
echo "Testing ... ".$url." - ".$cInfo['url']."<br />";
if (stripos($response, $testText)!==false)
echo "....".$msg." - OK<br />";
else
{
echo "....<b style='color:red;'>".$msg." - Problem</b><br /><pre>";
print_r($errmsg);
print_r($cInfo);
print_r(htmlentities($response));
echo "</pre>There is a problem with cURL. You need to contact your server admin or hosting provider.";
}
}
nxs_cURLTest("http://www.tournamentsoftware.com/sport/draw.aspx?id=600CA297-99CA-4420-AE1A-698BA10C39B0&draw=1", "HTTPS to Toernooi.nl", 'link rel="canonical" href="http://www.tournamentsoftware.com/sport/draw.aspx?id=600CA297-99CA-4420-AE1A-698BA10C39B0&draw=1"');

任何人都可以帮我这个吗？

Answer 1

我在带有chrome的xampp win7上使用了以下内容并返回了数据

$url = 'http://www.tournamentsoftware.com/sport/draw.aspx?id=600CA297-99CA-4420-AE1A-698BA10C39B0&draw=1';
echo curl_scrap($url);
function curl_scrap($url) {
  $ch = curl_init();
  $timeout = 5;
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
  $data = curl_exec($ch);
  curl_close($ch);
  return $data;
}

它没有返回相同的来源，只是使用chrome来命中url，因为函数没有像用户代理字段等那样设置，但返回了你想要的数据，只是一个快速而肮脏的测试。

PHP Curl获取页面内容

1 个答案: