Unable to scrape search results from Amazon

Time: 2020-03-16 07:34:46

Tags: php php-curl

<?php

$curl=curl_init();
$string_name="php books";
$url="https://www.amazon.in/s/field-keywords=$string_name";
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_exec($curl);
curl_close($curl);

?>


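With the code above I am trying to cURL the Amazon site for "php books". It shows: "Unable to connect using the required security protocol. To access the page requested, please upgrade or use a different browser or mobile device to ensure your experience on Amazon will be uninterrupted." Any ideas on how to get rid of this problem?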

Amazon error page

1 answer:

Answer 0 (score: 0)

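Using the options I mentioned in the comment (plus a few others), this handy curl function should help solve the problem. If you need to override a default option, just pass a different value for it in the $options array parameter. Likewise, you can add custom headers as needed.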
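As a rough illustration of the override behaviour described above: the function applies its defaults first and then each entry of $options, so the caller's value wins for any option set twice. The sketch below reproduces that effect on a plain array. `merge_curl_options` is a hypothetical helper name, not part of the answer's function, and the header value is only an example. No request is sent.

```php
<?php
// Hypothetical helper (not part of the answer's function): combine default
// cURL options with caller overrides so that the override wins.
function merge_curl_options(array $defaults, array $overrides): array
{
    // CURLOPT_* constants are integers. array_replace() keeps those keys,
    // whereas array_merge() would renumber them from 0.
    return array_replace($defaults, $overrides);
}

$defaults = [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_TIMEOUT        => 60,
    CURLOPT_FAILONERROR    => true,
];

$options = merge_curl_options($defaults, [
    CURLOPT_TIMEOUT     => 10,     // override a default
    CURLOPT_FAILONERROR => false,  // do not treat HTTP 4xx/5xx as a cURL failure
]);

// Custom headers are plain "Name: value" strings for CURLOPT_HTTPHEADER.
$headers = ['Accept-Language: en-IN,en;q=0.9'];
$options[CURLOPT_HTTPHEADER] = $headers;

$curl    = curl_init('https://www.amazon.in/');
$applied = curl_setopt_array($curl, $options); // nothing is sent until curl_exec()
```

With the answer's own function, the equivalent call would pass the overrides as the second argument and the header list as the third.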

<?php

    function curl( $url=NULL, $options=NULL, $headers=false ){
        /*
            Download a copy of `cacert.pem` from here
            https://curl.haxx.se/docs/caextract.html

            copy to webserver somewhere and modify below path
            to suit your environment
        */  
        $cacert='c:/wwwroot/cacert.pem';
        $vbh = fopen('php://temp', 'w+');


        session_write_close();

        /* Initialise curl request object */
        $curl=curl_init();
        if( parse_url( $url,PHP_URL_SCHEME )=='https' ){
            curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, true );
            curl_setopt( $curl, CURLOPT_SSL_VERIFYHOST, 2 );
            curl_setopt( $curl, CURLOPT_CAINFO, $cacert );
        }
        /* Define standard options */
        curl_setopt( $curl, CURLOPT_URL,trim( $url ) );
        curl_setopt( $curl, CURLOPT_AUTOREFERER, true );
        curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
        curl_setopt( $curl, CURLOPT_FAILONERROR, true );
        curl_setopt( $curl, CURLOPT_HEADER, false );
        curl_setopt( $curl, CURLINFO_HEADER_OUT, false );
        curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
        /* CURLOPT_BINARYTRANSFER omitted: it has no effect in current PHP, CURLOPT_RETURNTRANSFER already returns the raw body */
        curl_setopt( $curl, CURLOPT_CONNECTTIMEOUT, 20 );
        curl_setopt( $curl, CURLOPT_TIMEOUT, 60 );
        curl_setopt( $curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36' );
        curl_setopt( $curl, CURLOPT_MAXREDIRS, 10 );
        curl_setopt( $curl, CURLOPT_ENCODING, '' );
        curl_setopt( $curl, CURLOPT_VERBOSE, true );
        curl_setopt( $curl, CURLOPT_NOPROGRESS, true );
        curl_setopt( $curl, CURLOPT_STDERR, $vbh );
        /* Assign runtime parameters as options */
        if( isset( $options ) && is_array( $options ) ){
            foreach( $options as $param => $value ) curl_setopt( $curl, $param, $value );
        }
        if( $headers && is_array( $headers ) ){
            curl_setopt( $curl, CURLOPT_HTTPHEADER, $headers );
        }
        /* Execute the request and store responses */
        $res=(object)array(
            'response'  =>  curl_exec( $curl ),
            'info'      =>  (object)curl_getinfo( $curl ),
            'errors'    =>  curl_error( $curl )
        );
        rewind( $vbh );
        $res->verbose=stream_get_contents( $vbh );
        fclose( $vbh );
        curl_close( $curl );
        return $res;
    }

    $keyword='books';
    $baseurl='https://www.amazon.in/s/field-keywords';

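    /* rawurlencode() keeps keywords containing spaces, such as "php books", valid in the URL */
    $url=sprintf('%s=%s',$baseurl,rawurlencode($keyword));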
    $res=curl( $url );
    if( $res->info->http_code==200 ){
        echo $res->response;
    }
?>
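The script above prints nothing when the status is not 200, which makes failures hard to diagnose. A minimal sketch of how the other fields of the returned object (errors, verbose, info->http_code) could be surfaced is shown here. `describe_result` is a hypothetical helper, and the objects passed to it are hand-built stand-ins with placeholder text rather than real Amazon responses, so the sketch runs offline.

```php
<?php
// Hypothetical helper: summarise a result object shaped like the one
// returned by curl() above (response, info->http_code, errors, verbose).
function describe_result(object $res): string
{
    if ($res->response !== false && $res->info->http_code == 200) {
        return 'OK: ' . strlen($res->response) . ' bytes';
    }
    // Fall back to a generic note when cURL reported no error text.
    $reason = $res->errors !== '' ? $res->errors : 'no cURL error message';
    return sprintf("HTTP %d: %s\n%s", $res->info->http_code, $reason, $res->verbose);
}

// Stand-in for a failed request; every value here is a placeholder.
$failed = (object)[
    'response' => false,
    'info'     => (object)['http_code' => 503],
    'errors'   => 'example error text',
    'verbose'  => '* example verbose log',
];

// Stand-in for a successful request.
$ok = (object)[
    'response' => 'abc',
    'info'     => (object)['http_code' => 200],
    'errors'   => '',
    'verbose'  => '',
];

echo describe_result($failed), PHP_EOL;
echo describe_result($ok), PHP_EOL;
```

In the real script, `echo describe_result( $res );` in an else branch would show the cURL error and the verbose log, which includes the TLS handshake details relevant to the "required security protocol" message.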