PHP readfile() URL problem

Asked: 2014-01-21 23:55:19

Tags: php

I have a URL like this:

http://r16---sn-4g57kn6e.googlevideo.com/videoplayback?&quality=medium&signature=797C0FEB1961E6226294D5FC19BC0CD28657975C.1E745D852200D14B706F0EBF9EA8762680374564&itag=43&mv=m&ip=84.19.165.220&ipbits=0&ms=au&ratebypass=yes&source=youtube&mt=1390347607&id=8b92b07ff9cd9862&key=yt5&fexp=942502,916626,929305,936112,924616,936910,936913,907231,921090&upn=cMPazwtmyZU&sver=3&sparams=id,ip,ipbits,itag,ratebypass,source,upn,expire&expire=1390371882&type=video%2Fwebm%3B+codecs%3D%22vp8.0%2C+vorbis%22&fallback_host=tc.v12.cache5.googlevideo.com&title=Requiem+For+A+Dream+Original+Song&title=Requiem For A Dream Original Song

The problem is that readfile() fails with a "Bad Request" error because of the special characters in the URL.

If I use urlencode(), it breaks the URL even more.
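
To illustrate the urlencode() point, here is a minimal sketch using a made-up short URL (`example.com` and its parameters are assumptions, not the real video URL). It shows that encoding the whole URL also escapes the separators, while encoding only a single value leaves the structure intact.

```php
<?php
// Made-up short URL standing in for the long one above.
$url = 'http://example.com/play?title=A B&itag=43';

// urlencode() on the whole URL also escapes ":", "/", "?", "=" and "&",
// so the result is no longer a usable URL.
$whole = urlencode($url);
echo $whole, "\n";   // http%3A%2F%2Fexample.com%2Fplay%3Ftitle%3DA+B%26itag%3D43

// Encoding only the value keeps the URL structure intact.
$partial = 'http://example.com/play?title=' . urlencode('A B') . '&itag=43';
echo $partial, "\n"; // http://example.com/play?title=A+B&itag=43
```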

How should I handle this?

2 answers:

Answer 0 (score: 1)

Answer 1 (score: 0)

Combining what was in my original answer with some ideas from Brad's answer, I offer the following solution:

<?php
$url='http://r16---sn-4g57kn6e.googlevideo.com/videoplayback?&quality=medium&signature=797C0FEB1961E6226294D5FC19BC0CD28657975C.1E745D852200D14B706F0EBF9EA8762680374564&itag=43&mv=m&ip=84.19.165.220&ipbits=0&ms=au&ratebypass=yes&source=youtube&mt=1390347607&id=8b92b07ff9cd9862&key=yt5&fexp=942502,916626,929305,936112,924616,936910,936913,907231,921090&upn=cMPazwtmyZU&sver=3&sparams=id,ip,ipbits,itag,ratebypass,source,upn,expire&expire=1390371882&type=video%2Fwebm%3B+codecs%3D%22vp8.0%2C+vorbis%22&fallback_host=tc.v12.cache5.googlevideo.com&title=Requiem+For+A+Dream+Original+Song&title=Requiem For A Dream Original Song';
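$cleanUrl = parseQuery($url);
$data = getData($cleanUrl);
// curl_exec() returns false on failure, so check before reporting success
echo ($data === false) ? "file read failed\n" : "file read in OK\n";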

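// Split the URL at the first "?" and rebuild the query string with valid encoding
function parseQuery($url) {
  // $rawQuery[1] = everything up to and including "?", $rawQuery[2] = the query string
  if (!preg_match('/(https?:\/\/[^?]+\?)(.*)$/', $url, $rawQuery)) {
    return $url; // no query string, nothing to rebuild
  }
  // Collect keys and values; the last pair has no trailing "&", so do not require one
  preg_match_all('/([^=&]+)=([^&]*)/', $rawQuery[2], $queries);
  // Decode values first so already-encoded ones (e.g. %2F) are not encoded twice
  $qArray = array_combine($queries[1], array_map('urldecode', $queries[2]));
  $newUrl = $rawQuery[1] . http_build_query($qArray);
  return $newUrl;
}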

// Fetch the URL with curl and return the response body (false on failure)
function getData($url) {
  $useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";

  $curl = curl_init();
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_FOLLOWLOCATION, false); // do not follow redirects
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
  curl_setopt($curl, CURLOPT_USERAGENT, $useragent);
  $data = curl_exec($curl);
  curl_close($curl);
  return $data;
}
?>

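This mainly involves the following steps:

  1. Take the initial URL and split it into "everything before the ?" and "everything after it".
  2. The "before" part is left untouched; the "after" part is split into two arrays, the keys and the values of the query ("everything up to the =" and "everything up to the &").
  3. The two arrays are then combined into a single key/value array with array_combine.
  4. That array is turned into a valid query string using http_build_query (from Brad's answer).
  5. Finally, I use curl to fetch the file (because I know it better than readfile()).

It seems to work for me. If it does not work for you, let me know...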
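
The same split-and-rebuild steps can also be sketched with PHP's built-in parse_str() instead of regular expressions. This is a swapped-in variant, not the code from the answer: `rebuildQuery` is a hypothetical helper name and the `example.com` URL is made up. Note that parse_str() replaces dots and spaces in parameter names with underscores and keeps only the last value of a duplicate key, which is acceptable here only because of the assumption that the keys contain neither.

```php
<?php
// Hypothetical helper: re-encode the query string of a URL so that raw
// spaces and other unsafe characters become valid percent-encoding.
function rebuildQuery(string $url): string
{
    $pos = strpos($url, '?');
    if ($pos === false) {
        return $url;                      // no query string, nothing to do
    }
    $base  = substr($url, 0, $pos + 1);   // everything up to and including "?"
    $query = substr($url, $pos + 1);      // everything after "?"

    // parse_str() decodes each value; for duplicate keys the last one wins.
    parse_str($query, $params);

    // http_build_query() encodes every key and value exactly once.
    return $base . http_build_query($params);
}

// Made-up URL with an already-encoded value and a value containing raw spaces.
$example = 'http://example.com/videoplayback?&itag=43&type=video%2Fwebm&title=A B C';
echo rebuildQuery($example), "\n";
// http://example.com/videoplayback?itag=43&type=video%2Fwebm&title=A+B+C
```

Because the values are decoded before being encoded again, an already-encoded value such as `video%2Fwebm` is not double-encoded, and the stray `&` right after the `?` is dropped.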