$fileSource = "http://google.com";
$ch = curl_init($fileSource);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_exec($ch);
$retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ($retcode != 200) {
    $error .= "The source specified is not a valid URL.";
}
curl_close($ch);
Here is my problem. When I use the code above with $fileSource = "http://google.com";
it does not work, whereas with $fileSource = "http://www.google.com/";
it does.
What is the problem?
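Answer 0 (score: 1)
One URL returns a permanent redirect (301) to the www.
domain, while the other simply replies OK (200).
Why do you treat only the 200 status code as valid? Let cURL handle it for you: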
curl_setopt($ch, CURLOPT_FAILONERROR, true);
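From the manual:
Fail silently if the HTTP code returned is greater than or equal to 400. The default behavior is to return the page normally, ignoring the code.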
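To make the suggestion concrete, here is a minimal sketch of the check rewritten around CURLOPT_FAILONERROR. The function name urlResponds and the 10-second timeout are assumptions for illustration and are not part of the original code. A 301 is below 400, so with this option the google.com redirect no longer counts as a failure.

```php
<?php
// Hypothetical helper: true when the request completes with an HTTP status below 400.
function urlResponds(string $url): bool
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD-style request, no body download
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the result instead of printing it
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // make HTTP >= 400 count as a failure
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed timeout in seconds

    // curl_exec() returns false on transport errors and, because of
    // CURLOPT_FAILONERROR, also when the server answers with a status >= 400.
    // The handle is released when $ch goes out of scope.
    return curl_exec($ch) !== false;
}
```

In the question's code this would replace the status comparison: append to $error only when urlResponds($fileSource) returns false.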
Answer 1 (score: 0)
Try explicitly telling curl to follow redirects:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
If that does not work, you may need to spoof the user agent for some sites.
Also, if they redirect with JavaScript, you are out of luck.
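As a rough sketch of both suggestions together, the helper below follows redirects and sends a custom User-Agent. The name probeUrl, the User-Agent string, the redirect limit of 5, and the timeout are all assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: follows redirects and reports the final status code and URL.
function probeUrl(string $url, string $userAgent = 'Mozilla/5.0 (compatible; ExampleProbe/1.0)'): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,       // HEAD-style request, no body download
        CURLOPT_RETURNTRANSFER => true,       // do not print anything
        CURLOPT_FOLLOWLOCATION => true,       // follow Location: headers (301, 302, ...)
        CURLOPT_MAXREDIRS      => 5,          // assumed cap to avoid redirect loops
        CURLOPT_USERAGENT      => $userAgent, // assumed value; some sites reject the default
        CURLOPT_TIMEOUT        => 10,         // assumed timeout in seconds
    ]);
    curl_exec($ch);

    return [
        'code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),     // 0 if no HTTP response arrived
        'url'  => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), // last URL that was requested
    ];
}
```

With redirects followed, a request that starts at http://google.com should report the status of the page it finally lands on rather than the intermediate 301. JavaScript redirects are still invisible to cURL.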
Answer 2 (score: 0)
What you are seeing is actually the result of a 301 redirect. Here is what I get back using verbose curl from the command line:
curl -vvvvvv http://google.com
* About to connect() to google.com port 80 (#0)
* Trying 173.194.43.34...
* connected
* Connected to google.com (173.194.43.34) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.25.0 (x86_64-apple-darwin11.3.0) libcurl/7.25.0 OpenSSL/1.0.1b zlib/1.2.6 libidn/1.22
> Host: google.com
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
< Content-Type: text/html; charset=UTF-8
< Date: Fri, 04 May 2012 04:03:59 GMT
< Expires: Sun, 03 Jun 2012 04:03:59 GMT
< Cache-Control: public, max-age=2592000
< Server: gws
< Content-Length: 219
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
<
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
* Connection #0 to host google.com left intact
* Closing connection #0
However, if you curl the actual www.google.com address suggested by the 301 redirect, you get the following.
curl -vvvvvv http://www.google.com
* About to connect() to www.google.com port 80 (#0)
* Trying 74.125.228.19...
* connected
* Connected to www.google.com (74.125.228.19) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.25.0 (x86_64-apple-darwin11.3.0) libcurl/7.25.0 OpenSSL/1.0.1b zlib/1.2.6 libidn/1.22
> Host: www.google.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Fri, 04 May 2012 04:05:25 GMT
< Expires: -1
< Cache-Control: private, max-age=0
< Content-Type: text/html; charset=ISO-8859-1
I have truncated the rest of Google's response to highlight the main difference: 200 OK vs. 301 redirect.
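If you prefer to keep inspecting the status code yourself instead of following the redirect, a small classifier along these lines separates the two responses shown above. The function name classifyStatus and its three labels are hypothetical; adapt the ranges to whatever your application should accept.

```php
<?php
// Hypothetical helper: groups an HTTP status code into a coarse category.
function classifyStatus(int $code): string
{
    if ($code >= 200 && $code < 300) {
        return 'ok';        // e.g. the 200 from www.google.com
    }
    if ($code >= 300 && $code < 400) {
        return 'redirect';  // e.g. the 301 from google.com
    }
    return 'error';         // 4xx, 5xx, or 0 when no response was received
}
```

Passing the value of curl_getinfo($ch, CURLINFO_HTTP_CODE) to this function lets the original check reject only the 'error' category, so a redirecting URL is no longer reported as invalid.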