How do I get the source code of a page on a remote server?

Asked: 2015-01-15 00:33:31

Tags: php html

I want the source code of a remote website, so I used:

<?php
include_once('simple_html_dom.php');
$f = file_get_contents("http://163.53.77.55");
echo htmlspecialchars( $f ); 

That gave me the source code. But now I want the source of this page:

$f = file_get_contents("http://163.53.77.55/offers/");

I get this error:

  

Warning: file_get_contents(http://163.53.77.55/offers): failed to open stream: HTTP request failed! HTTP/1.1 500 Server Error in

In other words, it is as if I could see the source of stackoverflow.com but not of stackoverflow.com/questions/!

1 answer:

Answer 0: (score: 1)

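You have to use curl. But first turn off JavaScript in your browser and check whether the information you need is still there. For example, the offers page fetches its images via JavaScript.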
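If you would rather check this from PHP than in the browser, something like the following rough sketch may help. It assumes you already have the raw HTML in a string; `appearsInStaticHtml` and the sample markup are made up for illustration, and stripping `<script>` blocks with a regex is only an approximation of "what is there without JavaScript".

```php
<?php
// Hypothetical helper (not part of any library): reports whether $needle
// occurs in the HTML as delivered by the server, ignoring <script> blocks.
function appearsInStaticHtml(string $html, string $needle): bool
{
    // Remove <script>...</script> so text that exists only in JS source is not counted.
    $withoutScripts = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html);
    if ($withoutScripts === null) {
        // preg_replace failed (e.g. backtrack limit); fall back to the raw HTML.
        $withoutScripts = $html;
    }
    // Case-insensitive search in what is left.
    return stripos($withoutScripts, $needle) !== false;
}

// Made-up sample: the image URL exists only inside a script.
$sample = '<html><body><h1>Offers</h1>'
        . '<script>loadImage("/img/deal.jpg");</script></body></html>';

var_dump(appearsInStaticHtml($sample, 'Offers'));        // bool(true)
var_dump(appearsInStaticHtml($sample, '/img/deal.jpg')); // bool(false)
```

If the second kind of result is what you get for the data you want, fetching the page with curl alone will not be enough, because that content is only produced once the scripts run.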

The designers of this page are trying to discourage you from doing this.

When you use curl, send the user agent of an old smartphone.

This works:

$request = array();
$request[] = "Host: www.flipkart.com";
$request[] = "Connection: keep-alive";
$request[] = "Cache-Control: no-cache";
$request[] = "Pragma: no-cache";
$request[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$request[] = "User-Agent: MOT-V9mm/00.62 UP.Browser/6.2.3.4.c.1.123 (GUI) MMP/2.0";
$request[] = "Accept-Language: en-US,en;q=0.5";

$ch = curl_init('http://www.flipkart.com/offers/');
curl_setopt($ch, CURLOPT_ENCODING,"");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_FILETIME, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 100);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_TIMEOUT,100);
curl_setopt($ch, CURLOPT_FAILONERROR,true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $request);
$data = curl_exec($ch);
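if (curl_errno($ch)){
    // The transfer failed: report the cURL error instead of page content.
    $data .= 'Retrieve Base Page Error: ' . curl_error($ch);
}
else {
    // CURLOPT_HEADER is on, so the response starts with the header block;
    // split it from the body using the header size cURL reports.
    $skip = intval(curl_getinfo($ch, CURLINFO_HEADER_SIZE));
    $head = substr($data,0,$skip);
    $data = substr($data,$skip);
}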
echo $data;
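
The code above leaves the response headers in `$head` without using them. If you want to inspect them (for instance to see the status line when something goes wrong), a minimal sketch could look like this. It assumes a single header block, which should hold here since redirects are not followed; `parseHeaderBlock` is a name I made up, and the sample header string is invented.

```php
<?php
// Hypothetical helper: splits a raw header block into the status line
// and an array of fields keyed by lower-cased header name.
// Repeated headers (e.g. Set-Cookie) overwrite each other in this sketch.
function parseHeaderBlock(string $head): array
{
    // Accept \r\n as well as bare \n or \r line endings.
    $lines = preg_split('/\r\n|\n|\r/', trim($head));
    $status = array_shift($lines);
    $fields = array();
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not "Name: value"
        }
        list($name, $value) = explode(':', $line, 2);
        $fields[strtolower(trim($name))] = trim($value);
    }
    return array($status, $fields);
}

// Invented sample in the shape cURL returns with CURLOPT_HEADER enabled.
$sampleHead = "HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=UTF-8\r\nServer: example\r\n\r\n";
list($status, $fields) = parseHeaderBlock($sampleHead);
echo $status, "\n";                 // HTTP/1.1 200 OK
echo $fields['content-type'], "\n"; // text/html; charset=UTF-8
```

In the real script you would pass `$head` instead of `$sampleHead`.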