Question

I am fetching content from a website with cURL in PHP (using Simple HTML DOM).

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
echo str_get_html($output);

It returns this HTML at the top, followed by the HTML of the rest of the page:

<html><head><title>Object moved</title></head><body>  <h2>Object moved to <a href="/LocationSelection.aspx">here</a>.</h2>  </body></html>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN".........

I don't want the first HTML document. I want the HTML starting from <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" onward.

How can I do this with cURL? Is there any other way?

Edit: Can we add some kind of delay in cURL so that the whole HTML is loaded via AJAX first, the way we would use sleep(10)?

Answer 1

You want to retrieve the second HTML document, so just add this to the cURL options:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

Then you will get the redirect target (LocationSelection.aspx) without the "Object moved....." page.
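
As a rough sketch of how that option fits into a complete request, the snippet below wraps the call in two helpers. The function names `redirect_following_options` and `fetch_final_page` are hypothetical, and the redirect cap of 5 is an assumed value, not something from the original answer.

```php
<?php
// Hypothetical helper: builds the cURL options for a redirect-following GET.
function redirect_following_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow the 302 to its Location target
        CURLOPT_MAXREDIRS      => 5,    // assumed cap to avoid redirect loops
    ];
}

// Hypothetical helper: returns the body of the final page, or null on failure.
function fetch_final_page(string $url): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, redirect_following_options($url));
    $body = curl_exec($ch);
    if ($body === false) {
        // Log the transport error rather than echoing it into the page.
        error_log('cURL error: ' . curl_error($ch));
        return null;
    }
    return $body;
}
```

It could then be used in place of the original calls, for example: $output = fetch_final_page($url); followed by echo str_get_html($output); once $output is known not to be null.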

Answer 2

Your request:

novocinemas.com/Home.aspx

I just ran it in Chrome and got a 302 status, and then the redirect happened:

Home.aspx   GET 302 text/html   Other   260 B   1.25 s  
LocationSelection.aspx  GET 200 text/html   http://novocinemas.com/Home.aspx    2.2 KB  705 ms

Thanks, Satyadeep

Answer 3

After receiving the output from cURL, how about replacing the first HTML document with an empty string, using the code below?

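// Lazy match with the "s" modifier so the pattern stops at the first </html>,
// even if that document spans several lines; the limit of 1 removes only it.
$pattern = '/<html>.*?<\/html>/is';
$replace = preg_replace($pattern, '', $outputFromCurl, 1);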

echo htmlentities($replace);

Then you will get the second HTML document.

Hope that helps.
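
A possible alternative to the regular expression, assuming the wanted page always begins with <!DOCTYPE as in the output shown in the question, is to cut the string at that marker. The helper name `strip_before_doctype` is hypothetical.

```php
<?php
// Hypothetical helper: drops everything before the first "<!DOCTYPE".
function strip_before_doctype(string $html): string
{
    // Case-insensitive search for the start of the real document.
    $pos = stripos($html, '<!DOCTYPE');

    // If no DOCTYPE is present, return the input unchanged.
    return $pos === false ? $html : substr($html, $pos);
}
```

With the question's variables this would be called as $clean = strip_before_doctype($output); before passing $clean to str_get_html().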

cURL getting two HTML pages in PHP
