How to avoid redirects when using cURL in PHP

Time: 2018-07-08 20:14:27

Tags: php curl google-apps-script

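I want to call my Google Apps Script web service from PHP. I am trying to use cURL for this, with the GET method for starters. The service responds with simple text. I found a PHP function that can perform GET requests (see below).

The first response from the Apps Script service is always a redirect, so I turned off following redirects in the options array of the fetching function, read the redirect URL, and performed a second request. On the second request URL, however, I am redirected to the service URL (which displays the text yes, the service's response), and that is not what I want. I want to handle the response programmatically, without cURL affecting the output or behavior in the browser.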

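Initial state and how it should work:

  1. I have a working Apps Script web service URL.
  2. The web service returns text.
  3. I have a PHP script that calls the web service.
  4. After the call, the PHP script echoes what the service returned.
  5. After displaying what the service returned, the PHP script echoes the word hello.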

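What actually happens when I open the PHP script in the browser:

  1. I perform the first GET request and get a response.
  2. I take redirect_url from the response and perform a second GET request on it.
  3. The word hello is displayed for about a second, with the URL of the PHP script still in the browser's address bar.
  4. The URL in the browser changes to the URL of the service, hello disappears, and the service response yes appears (probably I just got redirected).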

The PHP script I use:

$response = get_web_page($url);

$url2 = $response["redirect_url"];

$response = get_web_page($url2);

echo $response["content"];

echo "hello";

The PHP function I use to fetch the URL:

function get_web_page( $url, $cookiesIn = '' ){
        $options = array(
            CURLOPT_RETURNTRANSFER => true,     // return web page
            CURLOPT_HEADER         => true,     //return headers in addition to content
            CURLOPT_FOLLOWLOCATION => false,     // do not follow redirects automatically
            CURLOPT_ENCODING       => "",       // handle all encodings
            CURLOPT_AUTOREFERER    => false,     // do not set Referer on redirect
            CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
            CURLOPT_TIMEOUT        => 120,      // timeout on response
            CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
            CURLINFO_HEADER_OUT    => true,
            CURLOPT_SSL_VERIFYPEER => true,     // Validate SSL Cert
            CURLOPT_HTTP_VERSION   => CURL_HTTP_VERSION_1_1,
            CURLOPT_COOKIE         => $cookiesIn
        );

        $ch      = curl_init( $url );
        curl_setopt_array( $ch, $options );
        $rough_content = curl_exec( $ch );
        $err     = curl_errno( $ch );
        $errmsg  = curl_error( $ch );
        $header  = curl_getinfo( $ch );
        curl_close( $ch );

        $header_content = substr($rough_content, 0, $header['header_size']);
        $body_content = trim(str_replace($header_content, '', $rough_content));
        $pattern = "#Set-Cookie:\\s+(?<cookie>[^=]+=[^;]+)#m"; 
        preg_match_all($pattern, $header_content, $matches); 
        $cookiesOut = implode("; ", $matches['cookie']);

        $header['errno']   = $err;
        $header['errmsg']  = $errmsg;
        $header['headers']  = $header_content;
        $header['content'] = $body_content;
        $header['cookies'] = $cookiesOut;
    return $header;
}
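
As a side note on the function above: it removes the headers from the raw response with str_replace and matches Set-Cookie case-sensitively. A minimal sketch of an alternative, assuming the same CURLOPT_HEADER => true setup, cuts the raw response at header_size and matches the header name case-insensitively (HTTP header names are case-insensitive). The helper names split_raw_response and extract_cookies and the sample response are hypothetical, made up for illustration.

```php
<?php
// Hypothetical helper, not part of the original function: splits a raw cURL
// response (fetched with CURLOPT_HEADER => true) into headers and body.
function split_raw_response(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'content' => trim(substr($raw, $headerSize)),
    ];
}

// Hypothetical helper: collects the name=value pair of every Set-Cookie
// header, matching the header name case-insensitively.
function extract_cookies(string $headers): string
{
    preg_match_all('/^Set-Cookie:\s*([^=;\s]+=[^;\r\n]*)/mi', $headers, $matches);
    return implode('; ', $matches[1]);
}

// Made-up sample response, for illustration only.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n"
     . "Set-Cookie: a=1; Path=/\r\nset-cookie: b=2\r\n\r\nyes";

// With a real handle, use curl_getinfo($ch, CURLINFO_HEADER_SIZE) instead.
$headerSize = strpos($raw, "\r\n\r\n") + 4;

$parts = split_raw_response($raw, $headerSize);
echo $parts['content'], "\n";                   // yes
echo extract_cookies($parts['headers']), "\n";  // a=1; b=2
```

Cutting at header_size avoids a search through the whole response and cannot accidentally touch the body.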

When I try to print the entire response of the second request, the content is apparently empty:

Array ( [url] => https://www.google.com/a/MY_DOMAIN_NAME/ServiceLogin?service=wise&passive=1209600&continue=https://script.google.com/a/MY_DOMAIN_NAME/macros/s/AKfycbwg9-Y10kLyIJxGLYhpx9BnIu5f8AHs4qKEq1rSUoD-ugjsD3c/exec&followup=https://script.google.com/a/MY_DOMAIN_NAME/macros/s/AKfycbwg9-Y10kLyIJxGLYhpx9BnIu5f8AHs4qKEq1rSUoD-ugjsD3c/exec [content_type] => text/html; charset=UTF-8 [http_code] => 200 [header_size] => 511 [request_size] => 379 [filetime] => -1 [ssl_verify_result] => 0 [redirect_count] => 0 [total_time] => 0.146943 [namelookup_time] => 0.001001 [connect_time] => 0.026204 [pretransfer_time] => 0.102405 [size_upload] => 0 [size_download] => 1839 [speed_download] => 12515 [speed_upload] => 0 [download_content_length] => 1839 [upload_content_length] => 0 [starttransfer_time] => 0.146158 [redirect_time] => 0 [redirect_url] => [primary_ip] => 172.217.16.68 [certinfo] => Array ( ) [request_header] => GET /a/MY_DOMAIN_NAME/ServiceLogin?service=wise&passive=1209600&continue=https://script.google.com/a/MY_DOMAIN_NAME/macros/s/AKfycbwg9-Y10kLyIJxGLYhpx9BnIu5f8AHs4qKEq1rSUoD-ugjsD3c/exec&followup=https://script.google.com/a/MY_DOMAIN_NAME/macros/s/AKfycbwg9-Y10kLyIJxGLYhpx9BnIu5f8AHs4qKEq1rSUoD-ugjsD3c/exec HTTP/1.1 Host: www.google.com Accept: */* Accept-Encoding: deflate, gzip Cookie: [errno] => 0 [errmsg] => [headers] => HTTP/1.1 200 OK Content-Type: text/html; charset=UTF-8 X-Frame-Options: DENY Cache-control: no-cache, no-store Pragma: no-cache Expires: Mon, 01-Jan-1990 00:00:00 GMT Date: Sun, 08 Jul 2018 20:17:24 GMT X-Content-Type-Options: nosniff X-XSS-Protection: 1; mode=block Content-Length: 1839 Server: GSE Set-Cookie: GAPS=1:qc-OvEMCdZAhgMZ6MhkFoUwlPrOIGg:_W0RGhF378FPmC1G;Path=/a;Expires=Tue, 07-Jul-2020 20:17:24 GMT;Secure;HttpOnly;Priority=HIGH Alt-Svc: quic=":443"; ma=2592000; v="43,42,41,39,35" [content] =>
[cookies] => GAPS=1:qc-OvEMCdZAhgMZ6MhkFoUwlPrOIGg:_W0RGhF378FPmC1G )

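Edit:

It turns out that the service was set up to be accessible only by me. So the chain of events is now clearer: I accessed the service, Google asked for authentication, and because I was logged in to the browser as the required user, I was immediately redirected to the service. When I ran the PHP script in the browser after logging out, I got the Google login page.

I have now deployed the service so that anyone can access it, and I can reach it and get its content without being redirected. The difference in redirect behavior seems to be the /a/druna.cz/ part of the URL. Without it, and with the service still deployed for my access only, I get the login page as HTML content without a redirect. With it, the result is always a redirect. The actual URL is https://script.google.com/a/druna.cz/macros/s/AKfycbzo6Y_XiXHLZFuNEb2rB7GLVXbXhtBpCGo9AlL8ul-gITmvv6k/exec

So although a better deployment setting avoids the redirect, the question remains: what forces the redirect when authentication is required (with /a/druna.cz/ present), and is there a way to avoid the redirect and simply have the PHP script fail?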

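2 answers:

Answer 0: (score: 1)

I checked your code on the command line using the first URL, https://script.google.com/a/druna.cz/macros/s/AKfycbzo6Y_XiXHLZFuNEb2rB7GLVXbXhtBpCGo9AlL8ul-gITmvv6k/exec

It printed a lot of HTML and then printed "hello". The HTML contains one important statement: "window.location.replace(redirectUrl);". Because you access that HTML through a browser, the page redirects. When you are logged in in the browser, it probably redirects on to another URL because of similar JS code.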

<!DOCTYPE html>
<html lang="en">
  <head>
  <meta charset="utf-8">
  <meta name="robots" content="noindex">
  <title>Sign in - Google Accounts</title>
  <meta http-equiv="refresh" content="1; url=https://www.google.com/accounts/AccountChooser?hd=druna.cz&amp;continue=https%3A%2F%2Fscript.google.com%2Fa%2Fdruna.cz%2Fmacros%2Fs%2FAKfycbzo6Y_XiXHLZFuNEb2rB7GLVXbXhtBpCGo9AlL8ul-gITmvv6k%2Fexec&amp;followup=https%3A%2F%2Fscript.google.com%2Fa%2Fdruna.cz%2Fmacros%2Fs%2FAKfycbzo6Y_XiXHLZFuNEb2rB7GLVXbXhtBpCGo9AlL8ul-gITmvv6k%2Fexec&amp;service=wise"></meta>
  </head>
  <body >
  <form id="hiddenget" action="https://www.google.com/accounts/AccountChooser?hd=druna.cz&amp;continue=https%3A%2F%2Fscript.google.com%2Fa%2Fdruna.cz%2Fmacros%2Fs%2FAKfycbzo6Y_XiXHLZFuNEb2rB7GLVXbXhtBpCGo9AlL8ul-gITmvv6k%2Fexec&amp;followup=https%3A%2F%2Fscript.google.com%2Fa%2Fdruna.cz%2Fmacros%2Fs%2FAKfycbzo6Y_XiXHLZFuNEb2rB7GLVXbXhtBpCGo9AlL8ul-gITmvv6k%2Fexec&amp;service=wise" method="get">
  <noscript>
  You should turn on Javascript support.
  <input type="submit" id="nojssubmit" value="Continue">
  </noscript>
</form>
  <script nonce="HobMA9FpZpT2/2X7NDFffIUgemA">
window.onload = function() {
  var redirectUrl = 'https:\x2F\x2Fwww.google.com\x2Faccounts\x2FAccountChooser?hd=druna.cz\x26continue=https%3A%2F%2Fscript.google.com%2Fa%2Fdruna.cz%2Fmacros%2Fs%2FAKfycbzo6Y_XiXHLZFuNEb2rB7GLVXbXhtBpCGo9AlL8ul-gITmvv6k%2Fexec\x26followup=https%3A%2F%2Fscript.google.com%2Fa%2Fdruna.cz%2Fmacros%2Fs%2FAKfycbzo6Y_XiXHLZFuNEb2rB7GLVXbXhtBpCGo9AlL8ul-gITmvv6k%2Fexec\x26service=wise';
  var domain = 'druna.cz';
  var hash = window.location.hash;
  if (hash) {
  var match = hash.match(/[#&]Email=([^&]+)/);
  if (match) {
  redirectUrl += "&Email=" + match[1] + "@" + domain;
  }
  }
  window.location.replace(redirectUrl);
};
</script>
  </body>
</html>hello
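
To make the PHP script fail instead of passing such a page on to the browser, one option is to inspect the fetched content before echoing it. The following is only a heuristic sketch: the function name contains_client_redirect and the sample strings are hypothetical, and the two patterns cover just the meta refresh tag and the window.location call seen in the output above, not every way a page can navigate.

```php
<?php
// Hypothetical helper: returns true when the fetched HTML would make a
// browser navigate away if it were echoed unescaped. Heuristic only.
function contains_client_redirect(string $html): bool
{
    // <meta http-equiv="refresh" ...>
    if (preg_match('/<meta[^>]+http-equiv\s*=\s*["\']?refresh/i', $html)) {
        return true;
    }
    // window.location.replace(...), location.assign(...), location.href = ...
    return (bool) preg_match('/\blocation\.(replace|assign|href)\b/i', $html);
}

// Made-up samples, for illustration only.
$samples = [
    'meta refresh' => '<html><head><meta http-equiv="refresh" content="1; url=https://example.com/login"></head></html>',
    'javascript'   => '<script>window.onload = function () { window.location.replace(redirectUrl); };</script>',
    'plain text'   => 'yes',
];

foreach ($samples as $label => $content) {
    if (contains_client_redirect($content)) {
        echo $label, ": refusing to print, content would redirect the browser\n";
    } else {
        echo $label, ': ', $content, "\n";
    }
}
```

In the question's script, the same check could be applied to $response["content"] before the echo, with the script exiting with an error when it returns true.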

Answer 1: (score: 1)

The browser redirects after a second because you print the JavaScript or HTML tags with this line:

echo $response["content"];

There are two options here. You can HTML-encode the response so that the browser renders the HTML as text:

echo htmlentities ( $response["content"], ENT_QUOTES | ENT_HTML401 | ENT_SUBSTITUTE | ENT_DISALLOWED, 'UTF-8', true );

Or, if you do not want to show the HTML itself but only its text content, you can extract the text:

$domd = new DOMDocument(); // loadHTML() is an instance method; calling it statically fails on PHP 8
@$domd->loadHTML($response["content"]);
echo htmlentities ( $domd->textContent, ENT_QUOTES | ENT_HTML401 | ENT_SUBSTITUTE | ENT_DISALLOWED, 'UTF-8', true );

Or, of course, if you only want to display a specific part of the response, you can parse out that part and print only that.
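
Parsing out a specific part could look like the sketch below, which pulls only the title element out of the page and prints it escaped. It assumes the DOM extension is available; the function name extract_title and the sample HTML are hypothetical, and the title is just one example of a "specific part".

```php
<?php
// Hypothetical helper: returns the text of the <title> element of an HTML
// document, or null when there is none or the input cannot be parsed.
function extract_title(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $titles = $doc->getElementsByTagName('title');
    return $titles->length > 0 ? trim($titles->item(0)->textContent) : null;
}

// Made-up sample, for illustration only.
$html = '<!DOCTYPE html><html><head><title>Sign in - Google Accounts</title></head>'
      . '<body><p>ignored</p></body></html>';

$title = extract_title($html);
echo htmlentities((string) $title, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8'), "\n";
```

Escaping the extracted text before printing keeps the browser from interpreting anything that came from the remote page.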