Cannot extract data from a website using PHP cURL

Asked: 2018-03-13 04:25:51

Tags: php curl

I am working on a project that needs to fetch data from another web page: https://eth.ethfans.org/#/miner?0x2998850087633a4806191960c94ed535d97da598

I am trying to do this with cURL:

<?php

$url = "https://eth.ethfans.org/#/miner?0x2998850087633a4806191960c94ed535d97da598";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);

$contents = curl_exec($ch);
curl_close($ch);
echo $contents;
?>

However, I only get the site's layout; I cannot get the data inside it.

Can anyone help with this?

Thanks in advance.

Regards, Alex

2 Answers:

Answer 0 (score: 0)

Use str_get_html (from the PHP Simple HTML DOM Parser library) to extract data from the layout:

$get_html = str_get_html($contents);

Example:

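public function check()
  {
    $url = "https://stackoverflow.com/questions/49248329/cannot-extract-the-data-from-the-website-using-php-curl";

    $get_html = $this->get_curl($url);
    #print_r($get_html); exit;
    // str_get_html() comes from the PHP Simple HTML DOM Parser; it returns false on failure
    $get_html = str_get_html($get_html);
    if ($get_html === false) {
      return;
    }

    $fb = NULL;
    foreach ($get_html->find('a') as $v) { // iterate over every <a> element in the fetched HTML

      // strpos() returns 0 for a match at position 0, so compare against false explicitly
      if (strpos((string) $v->href, 'facebook') !== false)
      {
        echo $fb = $v->href;
        echo "\n";
        break;
      }
    }
    unset($get_html);

  }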

public function get_curl($url)
  {
    // The URL is passed to curl_init(), so CURLOPT_URL does not need to be set again
    $ch = curl_init($url);

    $headers = [
       'Accept-Language: en-US,en;q=0.5',
       'Cache-Control: no-cache',
       'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:28.0) Gecko/20100101 Firefox/51.0',
    ];

    // Return the response body as a string instead of printing it,
    // which makes output buffering unnecessary
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    curl_setopt($ch, CURLOPT_AUTOREFERER, true);

    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

    $response = curl_exec($ch); // false on failure

    curl_close($ch);

    return $response;
  }

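Answer 1 (score: 0)

You are requesting the wrong URL. The page you requested contains only the layout and the JavaScript needed to fetch the actual data; that JavaScript then loads the data from https://eth.ethfans.org/api/page/miner?value=2998850087633a4806191960c94ed535d97da598. So do what the JavaScript does and fetch that URL instead.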
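
A minimal sketch of that approach follows. It rests on a few assumptions that are not confirmed in this thread: that the endpoint responds with JSON, that it needs no extra headers or cookies, and that the value parameter is simply the miner address without its "0x" prefix (which is how the URL above looks). The function names miner_api_url and fetch_json are made up for illustration. The first helper also shows why the original request could never work: everything after "#" is a URL fragment, which the browser keeps to itself and cURL never sends to the server.

```php
<?php
// Hypothetical helper: derive the API URL from the hash-routed page URL.
function miner_api_url(string $pageUrl): string
{
    // The part after "#" is the fragment; it is never sent to the server.
    $fragment = parse_url($pageUrl, PHP_URL_FRAGMENT);
    if (!is_string($fragment)) {
        throw new InvalidArgumentException('URL has no fragment');
    }

    // The fragment looks like "/miner?0x2998..."; take what follows the "?".
    $pos = strpos($fragment, '?');
    if ($pos === false) {
        throw new InvalidArgumentException('Fragment has no miner address');
    }
    $address = substr($fragment, $pos + 1);

    // Assumption: the API expects the address without the "0x" prefix.
    if (stripos($address, '0x') === 0) {
        $address = substr($address, 2);
    }

    return 'https://eth.ethfans.org/api/page/miner?value=' . rawurlencode($address);
}

// Hypothetical helper: GET a URL and decode the body, assuming it is JSON.
function fetch_json(string $url, int $timeout = 5): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout * 2,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_HTTPHEADER     => ['Accept: application/json'],
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("cURL request failed: $error");
    }
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($status !== 200) {
        throw new RuntimeException("Unexpected HTTP status: $status");
    }

    $data = json_decode($body, true);
    if (!is_array($data)) {
        throw new RuntimeException('Response was not a JSON object or array');
    }
    return $data;
}

// Usage (performs a real network request, so it is left commented out):
// $page = 'https://eth.ethfans.org/#/miner?0x2998850087633a4806191960c94ed535d97da598';
// print_r(fetch_json(miner_api_url($page)));
```

The structure of the decoded array depends on what the API actually returns, so inspect it with print_r before relying on any particular key.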