Question

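I have a script that uses curl to fetch the HTML of a web page. Sometimes it retrieves the page perfectly, and other times it seems to hang. I added a timeout: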

curl_setopt($ch, CURLOPT_TIMEOUT, 10);

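Now the script no longer hangs, but when it times out it returns no HTML at all. Is there a way to make curl return whatever HTML it received before the timeout? Or is there another way to achieve the same goal, that is, "fetch all the HTML a URL delivers within a specified period of time"?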

Answer 1

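Use CURLOPT_FILE. curl then writes the response body to the file as it arrives, so whatever was received before the timeout is still in the file afterwards.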

Example:

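<?php
// Stream the response body straight into a file so that partial data
// survives a timeout.
$ch = curl_init("http://www.example.com/");
$fp = fopen("/path/to/save/file", "w");

curl_setopt($ch, CURLOPT_FILE, $fp);    // write the body to $fp as it arrives
curl_setopt($ch, CURLOPT_TIMEOUT, 10);  // give up after 10 seconds
curl_setopt($ch, CURLOPT_HEADER, 0);    // keep response headers out of the output
curl_exec($ch);
curl_close($ch);
fclose($fp);

// The file now holds whatever was downloaded before completion or timeout.
echo file_get_contents("/path/to/save/file");
?>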
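If writing to disk is not wanted, the same idea can probably be done in memory with CURLOPT_WRITEFUNCTION, which hands each chunk to a callback as it arrives. The following is only a minimal sketch: fetchPartial() and the keys of the array it returns are hypothetical names chosen for this example, not part of the cURL extension.

```php
<?php
// fetchPartial() is a hypothetical helper, not a cURL function.
// It collects every chunk curl delivers, so the data received before a
// timeout is still available after curl_exec() fails.
function fetchPartial($url, $timeoutSeconds)
{
    $body = '';
    $ch = curl_init($url);

    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $chunk) use (&$body) {
        $body .= $chunk;          // keep the chunk
        return strlen($chunk);    // tell curl the whole chunk was consumed
    });

    $ok = curl_exec($ch);         // true on success, false on error or timeout
    $timedOut = (curl_errno($ch) === CURLE_OPERATION_TIMEDOUT);

    return array(
        'body'      => $body,           // possibly partial HTML
        'complete'  => ($ok !== false), // false if the transfer was cut short
        'timed_out' => $timedOut,
    );
}
```

Calling fetchPartial("http://www.example.com/", 10) should give back whatever arrived within ten seconds in 'body', with 'timed_out' indicating whether the transfer was cut off by the timeout.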

Answer 2

With a stream wrapper you can even parse the data on the fly. Take a look at this: Manipulate a string that is 30 million characters long
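As a rough illustration of the stream wrapper idea, the sketch below registers a custom wrapper whose stream_write() method sees each chunk as it is written. The class name ChunkScanner, the "htmlscan" scheme, and the byte and line counting are assumptions made for this example and are not taken from the linked answer. The fwrite() call stands in for data that curl would deliver; in a real request the handle would presumably be passed as CURLOPT_FILE, as in Answer 1, and that wiring is not exercised here.

```php
<?php
// ChunkScanner and the "htmlscan" scheme are hypothetical names for this sketch.
class ChunkScanner
{
    public $context;           // set by PHP when a stream context is supplied
    public static $bytes = 0;  // total bytes seen so far
    public static $lines = 0;  // newline count, updated chunk by chunk

    public function stream_open($path, $mode, $options, &$openedPath)
    {
        self::$bytes = 0;
        self::$lines = 0;
        return true;
    }

    // Called for each chunk written to the stream; returns the bytes consumed.
    public function stream_write($chunk)
    {
        self::$bytes += strlen($chunk);
        self::$lines += substr_count($chunk, "\n");
        return strlen($chunk);
    }

    public function stream_flush() { return true; }
    public function stream_close() {}
}

stream_wrapper_register('htmlscan', 'ChunkScanner');

// Simulate incoming data; curl would write to the same handle via CURLOPT_FILE.
$fp = fopen('htmlscan://buffer', 'w');
fwrite($fp, "<html>\n<body>\n");
fclose($fp);

echo ChunkScanner::$bytes, " bytes, ", ChunkScanner::$lines, " lines\n";
```

Counting newlines is used here because it is safe across chunk boundaries; parsing tags on the fly would need to handle a tag split between two chunks.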

How to get the HTML of a curl request within a specified time
