I'm working on a project that needs to fetch data from another web page: https://eth.ethfans.org/#/miner?0x2998850087633a4806191960c94ed535d97da598
I'm trying to do this with cURL:
<?php
$url = "https://eth.ethfans.org/#/miner?0x2998850087633a4806191960c94ed535d97da598";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$contents = curl_exec($ch);
curl_close($ch);
echo $contents;
?>
However, I only get the site's layout; I can't get the data inside it.
Can anyone help with this?
Thanks in advance.
Best regards, Alex
Answer 0: (score: 0)
Use str_get_html (from the Simple HTML DOM library) to extract data from the layout:
$get_html = str_get_html($contents);
Example:
// Both methods rely on $this, so they have to live in a class; "Scraper" is only a placeholder name.
// str_get_html() comes from the Simple HTML DOM library, which must be loaded beforehand.
class Scraper
{
    public function check()
    {
        $url = "https://stackoverflow.com/questions/49248329/cannot-extract-the-data-from-the-website-using-php-curl";
        $response = $this->get_curl($url);
        #print_r($response); exit;
        $get_html = str_get_html($response);
        if (!$get_html) {
            return; // the request failed or the HTML could not be parsed
        }
        $fb = null;
        foreach ($get_html->find('a') as $v) { // select whichever elements you need from the layout
            // strpos() returns 0 for a match at position 0, so compare against false explicitly
            if (strpos($v->href, 'facebook') !== false) {
                $fb = $v->href;
                echo $fb, "\n";
                break;
            }
        }
        unset($get_html);
    }

    public function get_curl($url)
    {
        $ch = curl_init($url); // passing $url here already sets CURLOPT_URL
        $headers = [
            'Accept-Language: en-US,en;q=0.5',
            'Cache-Control: no-cache',
            'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:28.0) Gecko/20100101 Firefox/51.0',
        ];
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_AUTOREFERER, true);
        curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
        $response = curl_exec($ch); // false on failure
        curl_close($ch);
        return $response;
    }
}
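If the Simple HTML DOM library is not available, roughly the same link search can be sketched with PHP's built-in DOM extension. This is a swapped-in technique, not the one used above, and `find_first_link` is a hypothetical helper name chosen for illustration:

```php
<?php
// Sketch: return the first <a href> in $html whose URL contains $needle, or null.
// Uses DOMDocument (ext-dom) in place of str_get_html().
function find_first_link(string $html, string $needle): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        // Compare against false so a match at position 0 still counts.
        if (strpos($href, $needle) !== false) {
            return $href;
        }
    }

    return null;
}
```

The string returned by get_curl() could be passed in as `$html`. Like str_get_html, this only sees markup that is present in the server's response, so it does not help with data that JavaScript loads afterwards.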
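Answer 1: (score: 0)
You are requesting the wrong URL. Everything after the # is a fragment that is never sent to the server, so the page you fetch contains only the layout and the JavaScript needed to load the actual data. That JavaScript then fetches the data from https://eth.ethfans.org/api/page/miner?value=2998850087633a4806191960c94ed535d97da598. So do what the JavaScript does and request that URL instead.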
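As a minimal sketch of that approach: the code below derives the API URL from the page URL and requests it with cURL. The function names `build_api_url` and `fetch_json` are hypothetical, dropping the `0x` prefix is inferred from the API URL quoted above, and treating the response as JSON is an assumption that should be checked against what the endpoint actually returns.

```php
<?php
// Sketch: turn the page URL into the API URL that the page's JavaScript requests.
// The address sits inside the fragment (after '#'), which the server never sees.
function build_api_url(string $pageUrl): ?string
{
    $fragment = parse_url($pageUrl, PHP_URL_FRAGMENT); // e.g. "/miner?0x2998..."
    if (!is_string($fragment)) {
        return null; // no fragment, or a malformed URL
    }

    $queryPos = strpos($fragment, '?');
    if ($queryPos === false) {
        return null;
    }

    $address = (string) substr($fragment, $queryPos + 1);
    // Assumption: the API expects the address without the "0x" prefix.
    if (stripos($address, '0x') === 0) {
        $address = (string) substr($address, 2);
    }
    if ($address === '') {
        return null;
    }

    return 'https://eth.ethfans.org/api/page/miner?value=' . rawurlencode($address);
}

// Sketch: GET a URL and decode the body, assuming the endpoint returns JSON.
// Returns null if the request fails or the body is not a JSON object/array.
function fetch_json(string $url, int $timeout = 5): ?array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $timeout,  // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout * 2, // seconds allowed for the whole request
        CURLOPT_HTTPHEADER     => ['Accept: application/json'],
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        return null;
    }

    $data = json_decode($body, true);
    return is_array($data) ? $data : null;
}
```

With the URL from the question, `fetch_json(build_api_url($url))` would request the API endpoint directly; `print_r()` on the result is a quick way to see which fields it contains.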