How to open Facebook and Twitter pages with PHP cURL

Time: 2018-02-18 10:41:30

Tags: php facebook curl instagram php-curl

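When I try to open url1 (https://www.google.co.in), url2 (https://www.amazon.com) or url5 (https://www.instagram.com), it works fine and the pages load. But when I try to open url3 (https://www.facebook.com) or url4 (https://www.twitter.com), the script prints my error message, "Error, Unable to open.", because it cannot open the Facebook or Twitter page. I don't want to use the API. Thanks in advance.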

 <?php

    $curl = curl_init();

    //url1 = https://www.google.co.in
    //url2 = https://www.amazon.com
    //url3 = https://www.facebook.com
    //url4 = https://www.twitter.com
    //url5 = https://www.instagram.com

    $url ="https://www.facebook.com";

    curl_setopt($curl, CURLOPT_URL, $url);

    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

    //curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 2);

    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

    $output = curl_exec($curl);
    if($output)
    {
        echo $output;       
    }
    else
    {
        echo "Error, Unable to open.";
    }
?> 

1 Answer:

Answer 0 (score: 1)

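When debugging this kind of problem, enable CURLOPT_VERBOSE. Also, while debugging, use var_dump instead of echo: echo prints nothing for both false and an empty string, whereas var_dump shows which one you got. If you do that, you will see something like this: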
* Rebuilt URL to: https://www.facebook.com/
*   Trying 157.240.20.35...
* TCP_NODELAY set
*   Trying 2a03:2880:f10a:83:face:b00c:0:25de...
* TCP_NODELAY set
* Immediate connect fail for 2a03:2880:f10a:83:face:b00c:0:25de: Network is unreachable
* Connected to www.facebook.com (157.240.20.35) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=US; ST=California; L=Menlo Park; O=Facebook, Inc.; CN=*.facebook.com
*  start date: Dec 15 00:00:00 2017 GMT
*  expire date: Mar 22 12:00:00 2019 GMT
*  subjectAltName: host "www.facebook.com" matched cert's "*.facebook.com"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
*  SSL certificate verify ok.
> GET / HTTP/1.1
Host: www.facebook.com
Accept: */*

< HTTP/1.1 302 Found
< Strict-Transport-Security: max-age=15552000; preload
< Location: https://www.facebook.com/unsupportedbrowser
< Content-Type: text/html; charset=UTF-8
< X-FB-Debug: x3NeeaaJHxPQkX5Z9H7yMX3evzYJocXmZpzMV6GoWtacO8bXLL3O58vidPHZUvXTuP9iE9pHPEnbr/RvNsT23Q==
< Date: Mon, 19 Feb 2018 09:12:51 GMT
< Connection: keep-alive
< Content-Length: 0
< 
* Connection #0 to host www.facebook.com left intact
string(0) ""
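
A trace like the one above can be captured from PHP roughly as follows. This is only a sketch: the helper names debugCurlOptions and fetchWithTrace are made up for illustration, and routing the verbose output into a php://temp stream through CURLOPT_STDERR is one common way to collect it, not the only one.

```php
<?php
// Hypothetical helper: cURL options for a request with verbose tracing enabled.
function debugCurlOptions(string $url, $logStream): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_VERBOSE        => true,       // emit the "* ..." / "> ..." / "< ..." trace
        CURLOPT_STDERR         => $logStream, // send the trace to this stream
    ];
}

// Hypothetical helper: performs the request and returns [body, trace].
function fetchWithTrace(string $url): array
{
    $log  = fopen('php://temp', 'w+');
    $curl = curl_init();
    curl_setopt_array($curl, debugCurlOptions($url, $log));

    $body = curl_exec($curl); // false on failure, string (possibly empty) otherwise

    rewind($log);
    $trace = stream_get_contents($log);
    fclose($log);

    return [$body, $trace];
}

// Usage (requires network access):
// [$body, $trace] = fetchWithTrace('https://www.facebook.com');
// echo $trace;
// var_dump($body); // e.g. string(0) "" for an empty redirect response
```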

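The problem is that Facebook responds with an HTTP redirect (to https://www.facebook.com/unsupportedbrowser), and you are not following it. Enable CURLOPT_FOLLOWLOCATION to make curl handle redirects automatically. Why does Facebook redirect you? Because you are not sending a User-Agent header. Use CURLOPT_USERAGENT to set one that Facebook recognizes as supported, for example Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101 Firefox/52.0 (Firefox 52 ESR running on Windows 7 x64).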
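
A minimal sketch of that fix, assuming the same plain GET request as in the question. The helper names browserLikeOptions and fetchPage are hypothetical, and the redirect cap of 10 is an arbitrary choice of mine rather than something the answer prescribes. Certificate verification is left at its default, since the trace shows it succeeding.

```php
<?php
// Hypothetical helper: options that follow redirects and send a browser-like User-Agent.
function browserLikeOptions(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true, // follow 3xx Location headers automatically
        CURLOPT_MAXREDIRS      => 10,   // assumption: arbitrary cap to avoid redirect loops
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101 Firefox/52.0',
    ];
}

// Hypothetical helper: returns the page body, or false if the transfer failed.
function fetchPage(string $url)
{
    $curl = curl_init();
    curl_setopt_array($curl, browserLikeOptions($url));

    $output = curl_exec($curl);
    if ($output === false) {
        // Report the transport-level error instead of a generic message.
        fwrite(STDERR, 'cURL error: ' . curl_error($curl) . PHP_EOL);
    }

    return $output;
}

// Usage (requires network access):
// var_dump(fetchPage('https://www.facebook.com'));
```

Comparing with `=== false` rather than testing `if ($output)` keeps an empty but successful response from being reported as a failure.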

As for twitter.com:

* Rebuilt URL to: https://www.twitter.com/
*   Trying 104.244.42.193...
* TCP_NODELAY set
* Connected to www.twitter.com (104.244.42.193) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: businessCategory=Private Organization; jurisdictionC=US; jurisdictionST=Delaware; serialNumber=4337446; C=US; ST=California; L=San Francisco; O=Twitter, Inc.; OU=tsa_o Point of Presence; CN=twitter.com
*  start date: Jul 25 00:00:00 2017 GMT
*  expire date: Jul 30 12:00:00 2018 GMT
*  subjectAltName: host "www.twitter.com" matched cert's "www.twitter.com"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 Extended Validation Server CA
*  SSL certificate verify ok.
> GET / HTTP/1.1
Host: www.twitter.com
Accept: */*

< HTTP/1.1 301 Moved Permanently
< content-length: 0
< date: Mon, 19 Feb 2018 09:17:51 GMT
< location: https://twitter.com/
< server: tsa_o
< set-cookie: personalization_id="v1_ersTgWQIOjuJkjk6VFUlXw=="; Expires=Wed, 19 Feb 2020 09:17:51 UTC; Path=/; Domain=.twitter.com
< set-cookie: guest_id=v1%3A151903187127250514; Expires=Wed, 19 Feb 2020 09:17:51 UTC; Path=/; Domain=.twitter.com
< strict-transport-security: max-age=631138519
< x-connection-hash: aae827a6347e88db5f417a0c31bba366
< x-response-time: 101
< 
* Connection #0 to host www.twitter.com left intact
string(0) ""
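Here the server tries to redirect you to the non-www version of the site's URL, and again you are not following the redirect. Enable CURLOPT_FOLLOWLOCATION to make curl follow HTTP redirects automatically.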
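
To confirm this kind of diagnosis without reading the whole verbose trace, the status code and redirect target can be read back with curl_getinfo. The sketch below is illustrative only: isRedirectStatus and probeRedirect are hypothetical names, and the list of status codes treated as redirects is my own selection.

```php
<?php
// Hypothetical helper: true for the HTTP status codes that carry a redirect target.
function isRedirectStatus(int $status): bool
{
    return in_array($status, [301, 302, 303, 307, 308], true);
}

// Hypothetical helper: requests $url WITHOUT following redirects and
// reports the status code and where the server wanted to send us.
function probeRedirect(string $url): array
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_exec($curl);

    $status = (int) curl_getinfo($curl, CURLINFO_HTTP_CODE);
    $target = curl_getinfo($curl, CURLINFO_REDIRECT_URL);

    return [
        'status'   => $status,
        'redirect' => isRedirectStatus($status) ? $target : null,
    ];
}

// Usage (requires network access):
// var_dump(probeRedirect('https://www.twitter.com'));
```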