Why can't I cURL this website? (PHP)

Time: 2013-09-15 13:30:23

Tags: php curl

This is my cURL function. It worked fine until I tried it on this site: http://www.finalpazarlama.com/kategoriler

 //Curl
function curl($site){
    $ch=curl_init();
    $maxredirect = 2;
    curl_setopt($ch, CURLOPT_URL, $site);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $mr = $maxredirect === null ? 5 : intval($maxredirect);
    if (ini_get('open_basedir') == '' && ini_get('safe_mode' == 'Off')){
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, $mr > 0);
        curl_setopt($ch, CURLOPT_MAXREDIRS, $mr);
    }else{
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
        if ($mr > 0){
            $newurl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
            $rch = curl_copy_handle($ch);
            curl_setopt($rch, CURLOPT_HEADER, true);
            curl_setopt($rch, CURLOPT_NOBODY, true);
            curl_setopt($rch, CURLOPT_FORBID_REUSE, false);
            curl_setopt($rch, CURLOPT_RETURNTRANSFER, true);
            do {
                curl_setopt($rch, CURLOPT_URL, $newurl);
                $header = curl_exec($rch);
                if (curl_errno($rch)){
                    $code = 0;
                }else{
                    $code = curl_getinfo($rch, CURLINFO_HTTP_CODE);
                    if ($code == 301 || $code == 302){
                        preg_match('/Location:(.*?)\n/', $header, $matches);
                        $newurl = trim(array_pop($matches));
                    }else{
                        $code = 0;
                    }
                }
            }
            while ($code && --$mr);
            curl_close($rch);
            if (!$mr){
                if ($maxredirect === null){
                    trigger_error('Too many redirects. When following redirects, libcurl hit the maximum amount.',E_USER_WARNING);
                }else{
                    $maxredirect = 0;
                }
                return false;
            }
            curl_setopt($ch, CURLOPT_URL, $newurl);
        }
    }
    return curl_exec($ch);
}

When I try it with http://www.finalpazarlama.com/kategoriler, it returns an empty result. What could be the problem? Why can't I fetch the page?

1 answer:

Answer 0 (score: 0)

HTTP/1.1 302 Found
Cache-Control: private
Content-Length: 157
Content-Type: text/html; charset=utf-8
Location: /PageNotFound?aspxerrorpath=/kategoriler
Server: Microsoft-IIS/7.5
X-AspNetMvc-Version: 4.0
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
X-Powered-By-Plesk: PleskWin
Date: Sun, 15 Sep 2013 14:25:06 GMT

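That is the response. The server answers with a redirect whose Location is a relative URL, /PageNotFound?aspxerrorpath=/kategoriler, and your redirect loop passes that value straight to CURLOPT_URL. cURL cannot request a URL that has no scheme and host.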
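To make the failure easier to see, here is a minimal sketch that pulls the status code and the Location value out of a raw header block like the one above, then checks whether the Location is relative. The helper name parse_redirect() is hypothetical and is not part of the code in this thread; the header text is an abbreviated copy of the response shown above.

```php
<?php
// Hypothetical helper (not from the original answer): extract the status code
// and the Location value from a raw HTTP header block.
function parse_redirect(string $rawHeaders): array
{
    $status = 0;
    $location = null;

    // Status line, e.g. "HTTP/1.1 302 Found" or "HTTP/2 302".
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#m', $rawHeaders, $m)) {
        $status = (int) $m[1];
    }

    // Header names are case-insensitive, so "location:" must match too.
    if (preg_match('/^Location:[ \t]*(.*?)\s*$/mi', $rawHeaders, $m)) {
        $location = $m[1];
    }

    return ['status' => $status, 'location' => $location];
}

// Abbreviated copy of the response shown above.
$raw = "HTTP/1.1 302 Found\r\n"
     . "Cache-Control: private\r\n"
     . "Location: /PageNotFound?aspxerrorpath=/kategoriler\r\n"
     . "Server: Microsoft-IIS/7.5\r\n\r\n";

$info = parse_redirect($raw);

// A Location without a scheme is relative and cannot be used as CURLOPT_URL as-is.
$isRelative = $info['location'] !== null
    && parse_url($info['location'], PHP_URL_SCHEME) === null;

var_dump($info, $isRelative);
```

If $isRelative comes back true, the value has to be resolved against the URL that issued the redirect before it is requested.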

You need to turn the URL into an absolute one. I already have a function for that: https://code.google.com/p/add-mvc-framework/source/browse/project/trunk/functions/url.functions.php

Feel free to copy both functions. The final result:

<?php
//Curl
function curl($site){
    $ch=curl_init();
    $maxredirect = 2;
    curl_setopt($ch, CURLOPT_URL, $site);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $mr = $maxredirect === null ? 5 : intval($maxredirect);
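    // Let libcurl follow redirects itself when PHP allows it.
    // ini_get('safe_mode') yields "", "0", "Off" or false when safe mode is off.
    if (ini_get('open_basedir') == '' && !filter_var(ini_get('safe_mode'), FILTER_VALIDATE_BOOLEAN)){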
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, $mr > 0);
        curl_setopt($ch, CURLOPT_MAXREDIRS, $mr);
    }else{
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
        if ($mr > 0){
            $newurl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
            $rch = curl_copy_handle($ch);
            curl_setopt($rch, CURLOPT_HEADER, true);
            curl_setopt($rch, CURLOPT_NOBODY, true);
            curl_setopt($rch, CURLOPT_FORBID_REUSE, false);
            curl_setopt($rch, CURLOPT_RETURNTRANSFER, true);
            do {
                curl_setopt($rch, CURLOPT_URL, $newurl);
                $header = curl_exec($rch);
                if (curl_errno($rch)){
                    $code = 0;
                }else{
                    $code = curl_getinfo($rch, CURLINFO_HTTP_CODE);
                    if ($code == 301 || $code == 302){
                        // Header names are case-insensitive (HTTP/2 sends "location:"), hence the /i flag.
                        preg_match('/Location:(.*?)\n/i', $header, $matches);
                        $newurl = trim(array_pop($matches));
                        $newurl = build_url($site,$newurl);
                    }else{
                        $code = 0;
                    }
                }
            }
            while ($code && --$mr);
            curl_close($rch);
            if (!$mr){
                if ($maxredirect === null){
                    trigger_error('Too many redirects. When following redirects, libcurl hit the maximum amount.',E_USER_WARNING);
                }else{
                    $maxredirect = 0;
                }
                return false;
            }
            curl_setopt($ch, CURLOPT_URL, $newurl);
        }
    }
    return curl_exec($ch);
}

/**
 * URL functions
 *
 * @package ADD MVC\Functions
 *
 */

/**
 * Returns the complete url according to $base
 *
 * @param string $base
 * @param string $url
 *
 * @since ADD MVC 0.5
 *
 * @version 0.1
 */
function build_url($base,$url) {
   $base_parts=url_parts($base);

   # https://code.google.com/p/add-mvc-framework/issues/detail?id=81
   if (preg_match('/^(javascript|data)\:/',$url)) {
      return $url;
   }

   if ($url[0]==='/') {
      return rtrim($base_parts['protocol_domain'],'/').$url;
   }
   if ($url[0]==='?') {
      if (!$base_parts['pathname'])
         $base_parts['pathname']='/';
      return $base_parts['protocol_domain'].$base_parts['pathname'].$url;
   }
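   if ($url[0]==='#') {
      // Fragment-only reference: keep the base path and query, swap in the new fragment.
      return $base_parts['protocol_domain'].$base_parts['request_uri'].$url;
   }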
   if (preg_match('/^https?\:\/+/',$url)) {
      return $url;
   }

   return rtrim($base_parts['protocol_domain'],"/").$base_parts['path'].$url;
}

/**
 * Returns the URL parts of the url
 *
 * @param string $url
 *
 * @since ADD MVC 0.5
 */
function url_parts($url) {
   if (!preg_match('/^(?P<protocol_domain>(?P<protocol>https?\:\/+)(?P<domain>([^\/\W]|[\.\-])+))(?P<request_uri>(?P<pathname>(?P<path>\/(.+\/)?)?(?P<file>[^\?\#]+?)?)?(?P<query_string>\?[^\#]*)?)(\#(?P<hash>.*))?$/',$url,$url_parts)) {
      // debug_backtrace() returns an array, which echo cannot print; print the trace directly.
      debug_print_backtrace();
      throw new Exception("Invalid url: $url");
   }
   return $url_parts;
}
echo "<xmp>";
var_dump(curl('http://www.finalpazarlama.com/kategoriler'));
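
As a dependency-free alternative to build_url() above, the same resolution step can be sketched with PHP's built-in parse_url(). The name resolve_location() is hypothetical and is not part of ADD MVC or of cURL. It only covers the redirect shapes most servers send, and it does not normalise "./" or "../" segments.

```php
<?php
// Hypothetical helper, not the answer's build_url(): resolve a Location value
// against the URL that produced it.
function resolve_location(string $base, string $location): string
{
    // Already absolute (has a scheme): use it unchanged.
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $location)) {
        return $location;
    }

    $parts = parse_url($base);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        throw new InvalidArgumentException("Base URL must be absolute: $base");
    }

    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Scheme-relative: "//host/path".
    if (strncmp($location, '//', 2) === 0) {
        return $parts['scheme'] . ':' . $location;
    }
    // Host-relative: "/path?query", which is the case in this question.
    if ($location !== '' && $location[0] === '/') {
        return $origin . $location;
    }

    $path = $parts['path'] ?? '/';
    // Query-only: keep the base path, replace the query.
    if ($location !== '' && $location[0] === '?') {
        return $origin . $path . $location;
    }
    // Path-relative: replace everything after the last "/" of the base path.
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $location;
}

echo resolve_location(
    'http://www.finalpazarlama.com/kategoriler',
    '/PageNotFound?aspxerrorpath=/kategoriler'
), "\n";
```

Inside the redirect loop, a call such as resolve_location($site, $newurl) could stand in for build_url($site, $newurl) if you prefer not to copy the regex-based url_parts().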