I used a simple file_get_contents call, but I am not getting the actual content (output) back.
I can't figure out what is going wrong.
Code:
<?php
// $url = $_GET['url'];
// $flv_http_path = urlencode($url);
$flv_http_path = 'http://r12.bhartibb-maa1.c.youtube.com/videoplayback?ip=0.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor%2Coc%3AU0dXSlBSUl9FSkNNN19ITFZB&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1285074000&key=yt1&signature=3E1E4994130745C392FA479F6ACCE5F40E703A2C.A87325A1DCB178B04FD89A9DEEE811CDCB08157C&factor=1.25&id=8b2fd4fd9ac2f09f&st=lc';
echo "----$flv_http_path------";
$data = file_get_contents($flv_http_path);
echo "$data";
if ($data)
    echo "data is avail";
else
    echo "data not available";
// $new_flv_path = dirname(__FILE__).'/flvs/sample.flv';
$new_flv_path = '/home/public_html/temp/sample.flv';
if (file_put_contents($new_flv_path, $data))
    return $new_flv_path;
else
{
    echo "else part ";
    return false;
}
?>
I got that URL from the response headers of a YouTube video,
and the headers I got are:
http://v3.lscache1.c.youtube.com/videoplayback?ip=0.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor%2Coc%3AU0dXSlBTVl9FSkNNN19ITVpF&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1285088400&key=yt1&signature=536A81F10AA43A4E015BB05FA182A9A966047C3C.C22269E2E1ECFC2C2DE7A8A45BA2C3DF7CF1EC08&factor=1.25&id=fd61d32bbbd1be5e&
GET /videoplayback?ip=0.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor%2Coc%3AU0dXSlBTVl9FSkNNN19ITVpF&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1285088400&key=yt1&signature=536A81F10AA43A4E015BB05FA182A9A966047C3C.C22269E2E1ECFC2C2DE7A8A45BA2C3DF7CF1EC08&factor=1.25&id=fd61d32bbbd1be5e& HTTP/1.1
Host: v3.lscache1.c.youtube.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1) Gecko/20090616 Firefox/3.5
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: VISITOR_INFO1_LIVE=9CH-GUrsSEQ; __utma=27069237.1455305642.1275034254.1279868001.1280568792.6; __utmz=27069237.1279868001.5.2.utmcsr=google.com|utmccn=(referral)|utmcmd=referral|utmcct=/landing/youtube/lifeinaday/; watched_video_id_list_kvijayhari=7b1d7ce3852b9aca07a985813b83aaa6WxMAAABzCwAAADFuNzRnSExwU0M4cwsAAAB2ajgxNXlQNDFMQXMLAAAARWNjZ0lLdHVDM1lzCwAAAHFHZFo5elhoQ0ZvcwsAAAB0WXMwTXhvbTRjSXMLAAAAYUdBdDZwNGh0c2NzCwAAAGR2V25wMjdBSGZvcwsAAABtNDBhbG1SQzNzSXMLAAAANjhVT1BhTUtwOTBzCwAAADZnaFUxWDBqdVM4cwsAAABiRy0xYTRsUnlEMHMLAAAAWjh5OFFDRFNUQ29zCwAAADY0T0w3NzhBeUlFcwsAAABzQkl1OWpnSWtwQXMLAAAASllYM08wWEEteWdzCwAAAF95WGxpc0g4dkF3cwsAAABzcXZCSXdDMWxtWXMLAAAAaEMzd09EU0U5MHdzCwAAAGZaODhxaHduTVow; auto_translation=b901c47ed36700682e23d64062529856cwQAAAB0cnVl; PREF=f1=50000000&f2=2000&emt=iceberg&ftuc=32&ems=hd720&HIDDEN_MASTHEAD_ID=brO_JIa6RTI; use_hitbox=72c46ff6cbcdb7c5585c36411b6b334edAEAAAAw; GEO=489e10e70a42c0dfed7513e1895ffe1bcwsAAAAzSU56spxTTJhEAw==; watched_video_id_list=2aa4a241cbdc35137f13b3513ea3e653WwQAAABzCwAAAF9XSFRLN3ZSdmw0cwsAAABpeV9VX1pyQzhKOHMLAAAAd3ZsTUFKLVU2SEVzCwAAAENaQmpoVGQ0WjlN
HTTP/1.0 200 OK
Last-Modified: Sun, 20 Jun 2010 03:59:10 GMT
Content-Type: video/x-flv
Date: Tue, 21 Sep 2010 10:05:34 GMT
Expires: Tue, 21 Sep 2010 16:55:00 GMT
Cache-Control: public, max-age=24566
Content-Length: 4077907
Accept-Ranges: bytes
X-Content-Type-Options: nosniff
Server: gvs 1.0
X-Cache: MISS from localhost.localdomain
X-Cache-Lookup: MISS from localhost.localdomain:3128
Via: 1.0 localhost.localdomain:3128 (squid/2.6.STABLE6)
Connection: keep-alive
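Answer 0 (score: 1)
Check your URL.
When I put your URL into a browser it returned no content, so file_get_contents
returns an empty string.
You need to check the output of file_get_contents
with:
if($data !== false)
rather than
if($data)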
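To make the difference between the two checks concrete, here is a minimal sketch. The helper name `describeFetchResult` is hypothetical, and a local empty temporary file stands in for the remote URL so the example runs without network access.

```php
<?php
// Hypothetical helper: classify what file_get_contents() returned.
function describeFetchResult($data)
{
    if ($data === false) {
        return 'failed';   // the read itself failed (HTTP error, unreachable host, ...)
    }
    if ($data === '') {
        return 'empty';    // the read succeeded but the body had no bytes
    }
    return 'ok';
}

// An empty local file stands in for a URL that answers with an empty body.
$emptyFile = tempnam(sys_get_temp_dir(), 'fgc');
$data = file_get_contents($emptyFile);

var_dump((bool) $data);                 // false: if ($data) treats "empty" like "failed"
echo describeFetchResult($data), "\n";  // empty
unlink($emptyFile);
```

With the loose `if($data)` test, an empty body and a failed request look the same; the strict comparison keeps the two cases apart.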
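Answer 1 (score: 1)
I also got an HTTP 500 response. To scrape YouTube you will probably have to spoof the User-Agent of the request and take other measures to keep YouTube from identifying you as a crawler.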
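As a rough sketch of what sending a browser-like User-Agent through file_get_contents could look like: the helper name `buildBrowserLikeContext` is hypothetical, the timeout value is an arbitrary assumption, and the User-Agent string is the one from the request headers in the question. Other answers in this thread report that this alone may not be enough.

```php
<?php
// Hypothetical helper: build an HTTP stream context with a browser-like User-Agent.
function buildBrowserLikeContext($userAgent)
{
    return stream_context_create([
        'http' => [
            'method'        => 'GET',
            'user_agent'    => $userAgent,
            'timeout'       => 10,     // seconds (assumed value)
            'ignore_errors' => true,   // return the body even on 4xx/5xx responses
        ],
    ]);
}

$context = buildBrowserLikeContext(
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1) Gecko/20090616 Firefox/3.5'
);

// Usage (needs network access and allow_url_fopen=On), with $flv_http_path from the question:
// $data = file_get_contents($flv_http_path, false, $context);
```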
Answer 2 (score: 0)
I got an HTTP 403 when requesting that URL.
Response headers:
Content-Type: text/plain
Date: Tue, 21 Sep 2010 09:59:13 GMT
Proxy-Connection: close
Server: gvs 1.0
Via: 1.0 proxy3@XXXXX.sch.uk:8080 (squid/2.6.STABLE19), 1.0 wcsproxy.XXXX.org.uk:8080 (squid/2.6.STABLE19)
X-Cache: MISS from proxy3@XXX.sch.uk, MISS from wcsproxy.XXX.org.uk
X-Content-Type-Options: nosniff
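The same status can be read from PHP itself. After an HTTP read through file_get_contents, PHP populates `$http_response_header` with the response header lines. The sketch below assumes that array format; the helper name `lastStatusCode` is hypothetical, and the sample lines are copied from the 403 response quoted in this thread.

```php
<?php
// Hypothetical helper: extract the final HTTP status code from header lines
// such as those PHP places in $http_response_header after an HTTP read.
function lastStatusCode(array $headerLines)
{
    $status = null;
    foreach ($headerLines as $line) {
        // After redirects there can be several status lines; keep the last one.
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

$sample = ['HTTP/1.1 403 Forbidden', 'Content-Type: text/plain', 'Connection: close'];
echo lastStatusCode($sample), "\n"; // 403

// Usage after a real request (needs network access):
// $data = @file_get_contents($flv_http_path);
// $status = isset($http_response_header) ? lastStatusCode($http_response_header) : null;
```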
Answer 3 (score: 0)
Well, when I try to load the URL you reference in $flv_http_path,
I get:
HTTP/1.1 403 Forbidden
Content-Type: text/plain
Connection: close
X-Content-Type-Options: nosniff
Date: Tue, 21 Sep 2010 09:57:19 GMT
Server: gvs 1.0
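in response.
That should give you a clue :)
If that is not the actual file you are trying to open, and you are not actually trying to scrape YouTube, you should try wrapping the URL in urlencode(). Edit: but the URL is already urlencoded (duh!)
"If you're opening a URI with special characters, such as spaces, you need to encode the URI with urlencode()." - http://www.php.net/manual/en/function.file-get-contents.php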
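A small sketch of why wrapping an entire URL in urlencode() does not help, and of encoding only the query values instead. The host `example.com`, the parameter values, and the helper name `buildUrl` are assumptions made for illustration.

```php
<?php
// Encoding a complete URL also escapes the scheme and path separators,
// so the result is no longer a usable URL.
echo urlencode('http://example.com/video play'), "\n";
// http%3A%2F%2Fexample.com%2Fvideo+play

// Hypothetical helper: encode only the query values and leave the base URL alone.
function buildUrl($base, array $params)
{
    return $base . '?' . http_build_query($params, '', '&');
}

echo buildUrl('http://example.com/videoplayback', [
    'sparams' => 'id,expire,ip',
    'itag'    => 34,
]), "\n";
// http://example.com/videoplayback?sparams=id%2Cexpire%2Cip&itag=34
```

The second form produces the same kind of encoding already visible in the question's URL (`sparams=id%2Cexpire%2C...`), which is why encoding it again would not help.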
Answer 4 (score: 0)
The link is empty. Open the link in a browser and view the source. There is no data.
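Answer 5 (score: 0)
This is their way of stopping you from automatically scraping their FLV files.
You cannot fetch the file from your server because the download link (the one you got from your browser, or however you found the FLV link) is locked to your browser.
That is why everyone other than you who requests the link gets an HTTP 403 Forbidden, even with a spoofed User-Agent.
Try using cURL and displaying the headers, and you will see what I mean.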
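One way to do that with PHP's cURL extension is sketched below. The helper names `splitRawResponse` and `fetchWithHeaders` are hypothetical, the 30-second timeout is an arbitrary assumption, and the code needs the cURL extension plus network access to run against a real URL.

```php
<?php
// Hypothetical helper: split a raw cURL response (headers + body) using the
// header size reported by curl_getinfo(..., CURLINFO_HEADER_SIZE).
function splitRawResponse($raw, $headerSize)
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => (string) substr($raw, $headerSize),
    ];
}

// Hypothetical helper: fetch a URL with cURL and return status, headers and body.
function fetchWithHeaders($url, $userAgent)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, true);         // include response headers in the output
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // seconds (assumed value)

    $raw = curl_exec($ch);
    if ($raw === false) {
        return ['status' => null, 'error' => curl_error($ch)];
    }

    $status     = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);

    return ['status' => $status] + splitRawResponse($raw, $headerSize);
}

// Usage, with $flv_http_path from the question:
// $result = fetchWithHeaders($flv_http_path, 'Mozilla/5.0 (compatible)');
// echo $result['status'], "\n", isset($result['headers']) ? $result['headers'] : $result['error'];
```

Printing the status and headers this way shows whether the server answered with 200 and a video body or with a 403 and a short text response.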