curl on yahoo.co.jp

Time: 2019-02-01 02:45:44

Tags: php curl

I am trying to get the seller's name from a Yahoo Japan auction page. It worked until about a year ago and then suddenly stopped working.

The code below is only meant to fetch the auction page. After that, I will use preg_match to extract the information I need.

Any help would be welcome; I have been looking for a solution for months without success. Thank you in advance.

    <html>
    <head><title>Get info</title>
    <!--meta http-equiv="Content-Type" content="text/plain;charset=utf-8"/-->
    </head>
    <body>

    <?php
    $link = "https://page.auctions.yahoo.co.jp/jp/auction/c713387584";
    $agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0";

    // Create an empty cookie file for curl to read from and write to.
    $fp = fopen("cookie.txt", "w");

    $curl = curl_init();
    // Note: certificate verification is switched off here.
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($curl, CURLOPT_URL, $link);
    curl_setopt($curl, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($curl, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_USERAGENT, $agent);
    curl_setopt($curl, CURLOPT_VERBOSE, 1);
    curl_setopt($curl, CURLOPT_AUTOREFERER, false);
    curl_setopt($curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
    curl_setopt($curl, CURLOPT_HEADER, 0);

    // Fetch the page and print it as-is.
    $result = curl_exec($curl);
    curl_close($curl);
    print $result;

    fclose($fp);
    unlink("cookie.txt");
    ?>
    </body>
    </html>


1 answer:

Answer 0 (score: 1)

Your problem is most likely caused by a curl/OpenSSL that is too old (or whichever SSL backend your curl was compiled with).
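One way to check this from PHP is to look at what `curl_version()` reports. The sketch below is only an illustration: the helper name `tls12Hints` is made up here, and the thresholds it uses (libcurl 7.34.0 for the explicit TLS 1.2 setting, OpenSSL 1.0.1 for TLS 1.2 support) are general version facts, not anything specific to Yahoo's servers. Other SSL backends are not checked.

```php
<?php
// Hypothetical helper (not part of PHP or curl): returns hints about whether
// a curl build looks too old for TLS 1.2.
// $info is expected to have the shape returned by curl_version().
function tls12Hints(array $info): array
{
    $hints = [];

    // libcurl gained an explicit TLS 1.2 setting in 7.34.0.
    $version = $info['version'] ?? '0';
    if (version_compare($version, '7.34.0', '<')) {
        $hints[] = "libcurl $version predates 7.34.0";
    }

    // ssl_version looks like "OpenSSL/1.0.2k"; other backends (NSS, GnuTLS, ...)
    // report different strings and are skipped here.
    $ssl = $info['ssl_version'] ?? '';
    if (preg_match('~^OpenSSL/(\d+\.\d+\.\d+)~', $ssl, $m)
        && version_compare($m[1], '1.0.1', '<')) {
        $hints[] = "OpenSSL {$m[1]} predates 1.0.1 (no TLS 1.2)";
    }

    return $hints;
}

// Report on the curl build that this PHP installation actually uses.
if (function_exists('curl_version')) {
    $info = curl_version();
    echo 'libcurl: ', $info['version'], PHP_EOL;
    echo 'SSL backend: ', $info['ssl_version'], PHP_EOL;
    foreach (tls12Hints($info) as $hint) {
        echo 'warning: ', $hint, PHP_EOL;
    }
}
```

An empty list of warnings does not prove that TLS 1.2 works; it only means that neither version number is obviously too old.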

Here is what I get from the command line:

    $ curl --silent --verbose >/dev/null --http1.1 --tls-max 1.1 --cookie-jar dummy.txt https://page.auctions.yahoo.co.jp/jp/auction/c713387
    *   Trying 183.79.250.251...
    * TCP_NODELAY set
    * Connected to page.auctions.yahoo.co.jp (183.79.250.251) port 443 (#0)
    ...
    * TLSv1.1 (OUT), TLS handshake, Client hello (1):
    } [148 bytes data]
    * OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to page.auctions.yahoo.co.jp:443
    * Closing connection 0

    $ curl --silent --verbose >/dev/null --http1.1 --tls-max 1.2 --cookie-jar dummy.txt https://page.auctions.yahoo.co.jp/jp/auction/c713387
    *   Trying 183.79.250.251...
    * TCP_NODELAY set
    * Connected to page.auctions.yahoo.co.jp (183.79.250.251) port 443 (#0)
    ...
    < HTTP/1.1 404 Not Found
    < Cache-Control: private
    < Content-Type: text/html; charset=utf-8
    ...
    * Connection #0 to host page.auctions.yahoo.co.jp left intact

Compare this with SO:

    $ curl --silent --verbose >/dev/null --http1.1 --tls-max 1.1 https://stackoverflow.com/
    *   Trying 151.101.65.69...
    * TCP_NODELAY set
    * Connected to stackoverflow.com (151.101.65.69) port 443 (#0)
    ...
    < HTTP/1.1 200 OK
    < Cache-Control: private
    < Content-Type: text/html; charset=utf-8
    ...
    * Connection #0 to host stackoverflow.com left intact

In short: yahoo.co.jp only accepts clients that use at least TLS 1.2, whereas SO still allows older clients.
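If the curl build does support TLS 1.2 but is not negotiating it by default, one thing to try from PHP is to request it explicitly and print curl's error message instead of an empty page. This is a sketch, not a confirmed fix: the function names `tls12Options` and `fetchPage` are made up for illustration, the `CURL_SSLVERSION_TLSv1_2` constant needs a reasonably recent PHP and libcurl, and if the underlying SSL library has no TLS 1.2 support at all, upgrading it is the only remedy.

```php
<?php
// Hypothetical helper: curl options that explicitly request TLS 1.2.
function tls12Options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_SSLVERSION     => CURL_SSLVERSION_TLSv1_2,
    ];
}

// Hypothetical helper: fetch a page, throwing with curl's own message on failure
// so that a failed TLS handshake is visible rather than silent.
function fetchPage(string $url): string
{
    $curl = curl_init();
    curl_setopt_array($curl, tls12Options($url));
    $body = curl_exec($curl);
    if ($body === false) {
        $message = curl_error($curl);
        curl_close($curl);
        throw new RuntimeException("curl failed: $message");
    }
    curl_close($curl);
    return $body;
}

// Usage (performs a real network request):
// echo fetchPage('https://page.auctions.yahoo.co.jp/jp/auction/c713387584');
```

Checking the return value of `curl_exec()` matters here because the code in the question prints `$result` even when it is `false`, which shows up as a blank page with no hint that the handshake failed.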