Stop cURL from dropping the port number

Time: 2011-05-31 17:48:38

Tags: php image curl port

When I cURL the following:

<?php

$ch = curl_init();
curl_setopt ($ch, CURLOPT_PORT, "8081");
curl_setopt ($ch, CURLOPT_URL, "http://192.168.0.14:8081/comingEpisodes/" );
curl_setopt($ch, CURLOPT_USERPWD, "user:pass");
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
$curl_response = curl_exec($ch);
curl_close($ch);

echo $curl_response;
?>

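the page is returned but the images are not. I found the problem: 192.168.0.14 is my local host, and I am calling a page from an application that runs on port 8081. cURL seems to drop the port and change 192.168.0.14 to localhost, so the images no longer link to the correct location. How can I make sure the port is kept so that the images still load? Thanks.

Edit: I think the /comingEpisodes after the port is also part of the problem.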

1 answer:

Answer 0: (score: 3)

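Unless you are building a 100% proxy, you are simply dumping the cURL content into the browser. Everything in that content is now resolved relative to the page that echoed the cURL result, not relative to the original cURL request.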

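Basically, if you visit http://localhost and the code above lives in index.php, that page requests the :8081/comingEpisodes content and dumps it into the context of the original http://localhost. The browser now resolves everything it finds against http://localhost, not against the URL of the cURL request.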
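To make the resolution step concrete, here is a minimal sketch of how a relative reference ends up on the wrong host. The helper `resolve_url()` is hypothetical (it is not part of PHP or of the original answer) and is deliberately simplified: it ignores query strings, fragments, `//host` references, and `.`/`..` segments.

```php
<?php
// Hypothetical helper: resolve a reference roughly the way a browser does,
// relative to the URL of the page the reference was found on.
// Simplified on purpose; see the limitations listed above.
function resolve_url(string $base, string $ref): string
{
    // A reference that already has a scheme is absolute: use it as-is.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $ref)) {
        return $ref;
    }

    // Rebuild scheme://host[:port] from the base URL.
    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (isset($ref[0]) && $ref[0] === '/') {
        // Root-relative reference: replaces the whole path of the base.
        $path = $ref;
    } else {
        // Path-relative reference: replaces everything after the last "/".
        $basePath = $parts['path'] ?? '/';
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $ref;
    }

    return $origin . $path;
}

$ref = '/comingEpisodes/img1.jpg';

// What the browser requests when the HTML was echoed by localhost/index.php:
echo resolve_url('http://localhost/index.php', $ref), "\n";
// prints http://localhost/comingEpisodes/img1.jpg

// What it would have requested had it loaded the page from the real server:
echo resolve_url('http://192.168.0.14:8081/comingEpisodes/', $ref), "\n";
// prints http://192.168.0.14:8081/comingEpisodes/img1.jpg
```

The port is not lost by cURL itself; it disappears because the base URL the browser resolves against is the intermediary page, which has no port 8081 in it.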

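You could rewrite all of the content links in the document before outputting it, so that they point to something like "proxy.php?retrieve=old_url", and then have all of those links go through the same cURL context. At that point, though, you are building the foundation of a web proxy.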

End-User               Intermediary              End-Website
(http://localhost)     (localhost/index.php)     (http://192.168.0.14:8081/comingEpisodes/)
------------------     ---------------------     ------------------------------------------
Initial visit--------->
                       cURL Request------------->
                                                 Page Content (html basically)
                       Echoed back to user<------
Content<---------------
Finds <img> etc.------>
                       /comingEpisodes/img1.jpg  // 404 error, it's actually on :8081
                                                 // that localhost has no idea about
                                                 // because it's being hidden using cURL

A very simple demo:

<?php
  //
  // Very Dummied-down proxy
  //

  // Either get the url of the content they need, or use the default "page root"
  // when none is supplied. This is not robust at all, as this really only handles
  // relative urls (e.g. src="images/foo.jpg", something like src="http://foo.com/"
  // would become src="index.php?proxy=http://foo.com/" which makes the below turn
  // into "http://www.google.com/http://foo.com/")
  $_target = 'http://www.google.com/' . (isset($_GET['proxy']) ? $_GET['proxy'] : '');

  // Build the cURL request to get the page contents
  $cURL = curl_init($_target);
  try
  {
    // setup cURL to your liking
    curl_setopt($cURL, CURLOPT_RETURNTRANSFER, 1);

    // execute the request
    $page = curl_exec($cURL);

    // Forward along the content type (so images, files, etc all are understood correctly)
    $contentType = curl_getinfo($cURL, CURLINFO_CONTENT_TYPE);
    header('Content-Type: ' . $contentType);

    // close curl, we're done.
    curl_close($cURL);

    // test against the content type. If it is HTML then we need to re-parse
    // the page to add our proxy intercept in the URL so the visitor keeps using
    // our cURL request above for EVERYTHING it needs from this site.
    if (strstr($contentType,'text/html') !== false)
    {
      //
      // It's html, replace all the references to content using URLs
      //

      // First, load our DOM parser
      $html = new DOMDocument();
      $html->formatOutput = true;
      @$html->loadHTML($page); // was getting parse errors, added @ for demo purposes.

      // simple demo, look for image references and change them
      foreach ($html->getElementsByTagName('img') as $img)
      {
        // take a typical image:
        //   <img src="logo.jpg" />
        // and make it go through the proxy (so it uses cURL again):
        //   <img src="index.php?proxy=logo.jpg" />
        $img->setAttribute('src', sprintf('%s?proxy=%s', $_SERVER['PHP_SELF'], urlencode($img->getAttribute('src'))));
      }

      // finally dump it to client with the urls changed
      echo $html->saveHTML();
    }
    else
    {
      // Not HTML, just dump it.
      echo $page;
    }
  }
  // just in case, probably want to do something with this.
  catch (Exception $ex)
  {
  }
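
The comment at the top of the demo points out that an absolute `src` such as `http://foo.com/` gets glued onto the page root and produces a broken target. One possible way around that, sketched here under assumptions, is to check the `proxy` value before concatenating. The function name `proxy_target()` and its parameters are hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: decide which URL the proxy should fetch.
//   $root      - root of the site being proxied, e.g. 'http://www.google.com/'
//   $requested - raw value of the ?proxy= parameter (may be empty)
function proxy_target(string $root, string $requested): string
{
    // Absolute http(s) URL: fetch it as-is instead of appending it to $root.
    if (preg_match('#^https?://#i', $requested)) {
        return $requested;
    }

    // Relative (or empty) value: append to the root with exactly one slash
    // between the two parts.
    return rtrim($root, '/') . '/' . ltrim($requested, '/');
}

// Usage, mirroring the $_target line of the demo:
//   $_target = proxy_target('http://www.google.com/', $_GET['proxy'] ?? '');
echo proxy_target('http://www.google.com/', 'logo.jpg'), "\n";
echo proxy_target('http://www.google.com/', 'http://foo.com/'), "\n";
```

Note that passing absolute URLs through unchanged lets visitors fetch any address via the script, so a real deployment would probably want to restrict the allowed hosts. This sketch leaves that out.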