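I have the following problem. I need to fetch the HTML of several pages. All of them work with the PHP functions file() or file_get_contents(), or with cURL.
But it does not work for one URL! Here it is (of course, I am trying to fetch the HTML of the non-shortened URL).
I have tried everything and nothing helps. I can open the page in a browser, where it returns a 200 status, but I cannot get its content from PHP. When I try to fetch it with cURL, it returns a 500 error: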
Stack Trace:
[NullReferenceException: Object reference not set to an instance of an object.]
ASP.ypDetectClass..ctor() +47
ASP.immigration_immigrating_ainp_application_forms_aspx..ctor() +26
__ASP.FastObjectFactory_app_web_obqstzij.Create_ASP_immigration_immigrating_ainp_application_forms_aspx() +20
System.Web.Compilation.BuildResultCompiledType.CreateInstance() +32
System.Web.Compilation.BuildManager.CreateInstanceFromVirtualPath(VirtualPath virtualPath, Type requiredBaseType, HttpContext context, Boolean allowCrossApp, Boolean noAssert) +119
System.Web.UI.PageHandlerFactory.GetHandlerHelper(HttpContext context, String requestType, VirtualPath virtualPath, String physicalPath) +33
System.Web.UI.PageHandlerFactory.System.Web.IHttpHandlerFactory2.GetHandler(HttpContext context, String requestType, VirtualPath virtualPath, String physicalPath) +40
System.Web.HttpApplication.MapHttpHandler(HttpContext context, String requestType, VirtualPath path, String pathTranslated, Boolean useAppConfig) +160
System.Web.MapHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() +93
System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) +155
Version Information: Microsoft .NET Framework Version:2.0.50727.3623; ASP.NET Version:2.0.50727.3618
Answer 0 (score: 4)
You have to send a User-Agent HTTP header with the request.
With cURL, you can set the CURLOPT_USERAGENT option. This works:
$ch = curl_init();
curl_setopt( $ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; U; Linux i686; pt-BR; rv:1.9.2.18) Gecko/20110628 Ubuntu/10.04 (lucid) Firefox/3.6.18' );
curl_setopt( $ch, CURLOPT_URL, 'http://albertacanada.com/immigration/immigrating/ainp-application-forms.aspx' );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
$result = curl_exec ( $ch );
curl_close ( $ch );
echo $result;
See http://php.net/manual/en/function.curl-setopt.php; a user there has also left a note about this: http://www.php.net/manual/en/function.curl-setopt.php#10692
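Since the question also mentions file_get_contents(), here is a minimal sketch of sending the same header without cURL, using an HTTP stream context. The helper name build_user_agent_context() and the 10-second timeout are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): builds a stream
// context that makes file() / file_get_contents() send a User-Agent header.
function build_user_agent_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'user_agent' => $userAgent, // sent as the User-Agent request header
            'timeout'    => 10,         // assumed value, in seconds
        ],
    ]);
}
```

It could then be used as `$html = file_get_contents($url, false, build_user_agent_context($userAgent));`, where `$url` is the page address and `$userAgent` is a browser-like string such as the one above. file_get_contents() returns false on failure, so that result should be checked before use.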
Answer 1 (score: 0)
I was able to retrieve the page's content with command-line curl. So you most likely need to set a user agent in your script.
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
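To make failures like the 500 response in the question easier to spot, the request can be wrapped with a check on the transport error and on the HTTP status. This is only a sketch: the function names build_curl_options() and fetch_html(), the timeouts, and the choice to throw on any status of 400 or above are assumptions, not something the answers above prescribe.

```php
<?php
// Hypothetical helpers for illustration; not part of the original answers.

// Collects the cURL options in one place so the User-Agent is always set.
function build_curl_options(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects (e.g. shortened URLs)
        CURLOPT_CONNECTTIMEOUT => 5,    // assumed value, in seconds
        CURLOPT_TIMEOUT        => 15,   // assumed value, in seconds
    ];
}

// Fetches a page and throws if the transfer fails or the server
// answers with an error status (400 or above).
function fetch_html(string $url, string $userAgent): string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $userAgent));

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }

    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        throw new RuntimeException("Server answered with HTTP status $status");
    }

    return $body;
}
```

Calling `fetch_html($url, $userAgent)` inside a try/catch block then separates "the request never completed" from "the server returned an error page", which is the case the stack trace in the question shows.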