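I'm trying to fetch the source of pages through proxies. It worked as long as I only looped over the URLs and grabbed the source. But as soon as I try to loop over the proxies as well, it slows down and times out. I don't get an error message; it just keeps running. Is this a proxy problem or a code problem? I'm new to PHP, so any help is much appreciated.
You can see the problem at pelican-cement.com/bbb.html. The project is meant to scrape data from certain pages, but we are only about halfway there. Here is the code: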
<html>
<body>
<?
$urls=explode("\n", $_POST['url']);
$proxies=explode("\n", $_POST['proxy']);
for ( $counter = 0; $counter <= 6; $counter++) {
    for ( $count = 0; $count <= 6; $counter++) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL,$urls[$counter]);
        curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
        curl_setopt($ch, CURLOPT_PROXY,$proxies[$count]);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CUSTOMREQUEST,'GET');
        curl_setopt($ch, CURLOPT_HEADER, 1);
        curl_exec($ch);
        $curl_scraped_page = curl_exec($ch);
        $FileName = time();
        $FileHandle = fopen($FileName, 'w') or die("can't open file");
        fwrite($FileHandle, $curl_scraped_page);
        $hostname="***";
        $username="****";
        $password="****";
        $dbname="****";
        $usertable="****";
        $con=mysql_connect($hostname,$username, $password) or die ("<html><script language='JavaScript'>alert('Unable to connect to database! Please try again later.'),history.go(-1)</script></html>");
        mysql_select_db($dbname ,$con);
        $sql="INSERT INTO **** (time, ad1)
        VALUES
        ('$FileName','$domains')";
        if (!mysql_query($sql,$con))
        {
            die('Error: ' . mysql_error());
        }
        echo "1 record added";
        mysql_close($con);
        fclose($FileHandle);
        curl_close($ch);
        echo $FileName;
        echo "<br/>";
        sleep(1);
    }
}
?>
</body>
</html>
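Answer 0 (score: 0)
If you are running this through a browser, you are hitting the execution time limit (max_execution_time, 30 seconds by default): http://www.php.net/manual/en/info.configuration.php#ini.max-execution-time
You can run the script in CLI mode instead, where max_execution_time defaults to 0 (no limit), to avoid hitting the timeout.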
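Besides the time limit, two things in the posted code would also make it run far longer than intended. The inner loop increments $counter instead of $count, so $count never changes and the loop never ends. curl_exec() is also called twice per iteration, so every request is sent twice. Below is a minimal sketch of how the loops could be restructured. The helper names (clean_lines, fetch_via_proxy, scrape_all) and the timeout values are assumptions made for illustration, not part of the original code, and the file and database steps are left out.

```php
<?php
// Sketch only: the function names and timeout values below are hypothetical.

// Split textarea input into trimmed, non-empty lines.
// Form posts often use "\r\n", so a plain explode("\n") can leave a stray "\r".
function clean_lines(string $raw): array
{
    $lines = array_map('trim', preg_split('/\R/', $raw));
    return array_values(array_filter($lines, 'strlen'));
}

// Fetch one URL through one proxy.
// Returns the response (headers + body) as a string, or false on failure.
function fetch_via_proxy(string $url, string $proxy)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);
    // Cap the wait so a dead proxy cannot stall the whole run (assumed values).
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);

    $page = curl_exec($ch); // executed once per handle
    if ($page === false) {
        error_log("cURL error for $url via $proxy: " . curl_error($ch));
    }
    curl_close($ch);
    return $page;
}

// Visit every URL through every proxy. foreach iterates over the lists
// as given, so there is no hard-coded bound and no counter to mix up.
// $fetch is any callable taking ($url, $proxy), e.g. 'fetch_via_proxy'.
function scrape_all(array $urls, array $proxies, callable $fetch): array
{
    $results = [];
    foreach ($urls as $url) {
        foreach ($proxies as $proxy) {
            $results[] = [
                'url'   => $url,
                'proxy' => $proxy,
                'page'  => $fetch($url, $proxy),
            ];
        }
    }
    return $results;
}

// Possible usage from the form handler:
//   set_time_limit(0); // lift PHP's own limit if the script must run from a browser
//   $results = scrape_all(
//       clean_lines($_POST['url']),
//       clean_lines($_POST['proxy']),
//       'fetch_via_proxy'
//   );
```

Passing the fetch function in as a parameter keeps the looping logic separate from the network call, so the loop can be checked with a stub before pointing it at real proxies. Even with set_time_limit(0), the web server or the browser may still give up on a long request, which is another reason to prefer the CLI.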