Python script to download YouTube videos

Time: 2015-11-03 23:35:24

Tags: python youtube

Given a YouTube video URL, I first download the video page and extract the JavaScript object between

<script>var ytplayer = ytplayer .....  </script>

This is what I got:

{

    "args": {
        "is_listed": "1", 
        "account_playback_token": "QUFFLUhqbWdXR1NfQjRiRmNzWVhRVTM0ajlNcnM1alVUd3xBQ3Jtc0tsVi01WFp5VmV2MTU3RnpkYUVkRzVqR1ZTNUI4T2JaQzk1ckxPejdVNkYzUk5zOTdjZnNmb1BYZHNLQ05nblZZbFk2ZWJXNHRPNVFoNVVNc2RjTE1YekdKSGY4dlVhSnlCU1ctNFZJdXBKbWhIRG1TZw==", 
        "ptk": "RajshriEntertainment", 
        "focEnabled": "1", 
        "tag_for_child_directed": false, 
        "adaptive_fmts": ......, 
        "probe_url": .....,
        "rmktEnabled": "1", 
        "allow_ratings": "1", 
        "dbp": "ChoKFk5RNTV5UGs5bDZmSk5wSjQ4a3RiSHcQARABGAI", 
        "cc3_module": "1", 
        "no_get_video_log": "1", 
        "fmt_list": ......, 
        "title":..........,
        "invideo": true, 
        "sffb": true, 
        "iurlmq_webp": , 
        "cosver": "10_8_4", 
        "url_encoded_fmt_stream_map": ................., 
        "max_dynamic_allocation_ad_tag_length": "2040", 
        "innertube_api_key": "AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8", 
        "timestamp": "1446586407", 
        "cc_asr": "1", 
        "apiary_host_firstparty": "", 
        "adsense_video_doc_id": "yt_Vd4iNPuRlx4", 
        "innertube_context_client_version": "1.20151102", 
        "mpu": true, 
        "tmi": "1", 
        "ldpj": "-19", 
        "fade_out_duration_milliseconds": "1000", 
        .........
 }
}
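The extraction step described above can be sketched as follows. This is a minimal sketch in Python 3, not the asker's actual code: the marker string `ytplayer.config = ` and the sample page are assumptions made for illustration, since the question elides the real script contents. The helper name `extract_player_config` is hypothetical.

```python
import json


def extract_player_config(html, marker="ytplayer.config = "):
    """Return the JSON object that directly follows `marker`, or None."""
    start = html.find(marker)
    if start == -1:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever text trails it (here: ";</script>").
    config, _end = json.JSONDecoder().raw_decode(html, start + len(marker))
    return config


# Made-up sample page for illustration; not real YouTube output.
sample_html = (
    '<script>var ytplayer = ytplayer || {};'
    'ytplayer.config = {"args": {"title": "demo", "is_listed": "1"}};'
    '</script>'
)

config = extract_player_config(sample_html)
print(config["args"]["title"])
```

Using `raw_decode` avoids having to find the matching closing brace with a regular expression, which tends to break on nested objects.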

I found that the keys adaptive_fmts and url_encoded_fmt_stream_map contain multiple URLs in percent-encoded form. I took one URL from url_encoded_fmt_stream_map; it looks like this:

https://r1---sn-o3o-qxal.googlevideo.com/videoplayback?
ratebypass=yes&
signature=982E413BBE08CA5801420F9696E0F2ED691B99FA.D666D39D1A0AF066F76F12632A10D3B8076076CE&
lmt=1443906393476832&
expire=1446604919&
fexp=9406983%2C9408710%2C9414764%2C9416126%2C9417707%2C9421410%2C9422596%2C9423663&
itag=22&
dur=128.801&
source=youtube&
upn=pk2CEhVBeFM&
sver=3&
key=yt6&
id=o-AK-OlE5NUsbkp51EZY2yKuz5vsSGofgUvrvTtOrhC72e&
sparams=dur%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Cratebypass%2Crequiressl%2Csource%2Cupn%2Cexpire&
mime=video%2Fmp4&
ipbits=0&
pl=21&
ip=x.y.z.a&
initcwndbps=5405000&
requiressl=yes&
mn=sn-o3o-qxal&
mm=31&
ms=au&
mv=m&
mt=1446583222&
itag=22&
type=video/mp4

But when I paste the URL shown above into a browser, nothing happens; that is, it does not work. Please help.
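When a URL like this fails, it can help to look at its query parameters individually. The sketch below (Python 3) parses a shortened copy of the URL from the question. It assumes, without confirmation from the source, that `expire` is a Unix timestamp in seconds and that `sparams` is a comma-separated list of parameter names; the helper name `describe_stream_url` is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs


def describe_stream_url(url):
    """Return the names listed in sparams and the expiry time of a stream URL."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs percent-decodes values, so "%2C" becomes ",".
    listed = params["sparams"][0].split(",")
    # Assumption: expire is a Unix timestamp in seconds.
    expires = datetime.fromtimestamp(int(params["expire"][0]), tz=timezone.utc)
    return listed, expires


# Shortened copy of the URL from the question (only some parameters kept).
url = (
    "https://r1---sn-o3o-qxal.googlevideo.com/videoplayback?"
    "expire=1446604919&itag=22&ip=x.y.z.a&"
    "sparams=dur%2Cid%2Cip%2Citag%2Cexpire&mime=video%2Fmp4"
)

listed, expires = describe_stream_url(url)
print(listed)
print(expires.isoformat())
```

If the assumption about `expire` holds, comparing the printed time with the current time shows whether the link was still valid when it was pasted.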

Also:

What is the difference between adaptive_fmts and url_encoded_fmt_stream_map, both of which contain URLs?
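Pulling the URLs out of either key can be sketched as below (Python 3). It assumes that the value is a comma-separated list whose entries are themselves URL-encoded query strings carrying `itag` and `url` fields; the sample map is built locally with placeholder hosts and is not real YouTube data. The helper name `stream_urls` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def stream_urls(stream_map):
    """Map each itag in a comma-separated stream map to its decoded URL."""
    urls = {}
    for entry in stream_map.split(","):
        fields = parse_qs(entry)
        # parse_qs percent-decodes values and returns a list for every key.
        urls[fields["itag"][0]] = fields["url"][0]
    return urls


# Hypothetical two-entry map; urlencode percent-encodes each field value.
sample_map = ",".join(
    urlencode({"itag": itag, "url": url, "type": "video/mp4"})
    for itag, url in [
        ("22", "https://example.invalid/videoplayback?itag=22&expire=1446604919"),
        ("18", "https://example.invalid/videoplayback?itag=18&expire=1446604919"),
    ]
)

urls = stream_urls(sample_map)
print(urls["22"])
```

Only the decoded `url` field is returned here; the other fields of an entry, such as `type`, stay separate instead of being appended to the URL.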

1 Answer:

Answer 0: (score: 0)

In Python 2.7, this works:

import urlparse, urllib2

vid        = "vzS1Vkpsi5k"
save_title = "YouTube SpaceX - Booster Number 4 - Thaicom 8 06-06-2016"
url_init   = "https://www.youtube.com/get_video_info?video_id=" + vid

# Fetch the video metadata; the response body is a URL-encoded query string.
resp = urllib2.urlopen(url_init, timeout=10)
data = resp.read()
info = urlparse.parse_qs(data)

# parse_qs returns a list for every key, hence the [0].
title = info['title'][0]

print "title:   ", title
print "length:  ", info['length_seconds'][0] + " seconds"

# adaptive_fmts is a comma-separated list of URL-encoded stream entries.
stream_map   = info['adaptive_fmts'][0]
vid_info     = stream_map.split(",")

mp4_filename = save_title + ".mp4"

for video in vid_info:
    item = urlparse.parse_qs(video)

    #print 'quality: ', item['quality'][0]
    #print 'type:    ', item['type'][0]

    url_download  = item['url'][0]
    resp          = urllib2.urlopen(url_download)

    print resp.headers

    length  = int(resp.headers['Content-Length'])

    # Binary mode keeps the video bytes unchanged on every platform;
    # the with-block closes the file when the download finishes.
    with open(mp4_filename, "wb") as my_file:
        done, i = 0, 0
        buff    = resp.read(1024)
        while buff:
            my_file.write(buff)
            done += len(buff)   # count the bytes actually read
            buff = resp.read(1024)

            # Report progress once every 1000 chunks.
            if not i % 1000:
                percent = done * 100.0 / length
                print str(percent) + "%"

            i += 1

    # Only the first stream in the list is downloaded.
    break
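The chunked download loop can also be factored into a function that is testable without network access. This is a minimal sketch in Python 3 rather than the Python 2.7 used above; `copy_with_progress` and its parameters are hypothetical names, and an in-memory `io.BytesIO` stands in for the HTTP response.

```python
import io


def copy_with_progress(src, dst, total, chunk_size=1024, report_every=1000):
    """Copy src to dst in chunks; return the percentages that were reported."""
    done, chunk_index, reports = 0, 0, []
    while True:
        buff = src.read(chunk_size)
        if not buff:
            break
        dst.write(buff)
        # Count the bytes actually read; the last chunk is usually shorter.
        done += len(buff)
        if chunk_index % report_every == 0:
            reports.append(done * 100.0 / total)
        chunk_index += 1
    return reports


# In-memory stand-ins for the HTTP response and the output file.
payload = b"x" * 2500
src, dst = io.BytesIO(payload), io.BytesIO()
reports = copy_with_progress(src, dst, total=len(payload), report_every=1)
print(reports)
```

In real use, `src` could be the object returned by `urllib.request.urlopen` and `dst` a file opened with mode `"wb"`, with `total` taken from the `Content-Length` header.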