Unable to download a file over HTTP - the session hangs

Date: 2019-05-02 20:29:36

Tags: python http get

A friend told me that when he visited the following site:
http://tatoochange.com/watch/OGgmlnav-joe-gould-s-secret/vip.html

He noticed that while the video was playing, Fiddler showed the path of the file (http://85.217.223.24/vids/joe_goulds_secret_2000.mp4).

So he tried to download it directly from the browser, but got an error message.

I inspected the GET request in Burp while the video was playing:

GET /vids/joe_goulds_secret_2000.mp4 HTTP/1.1
Host: 85.217.223.24
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0
Accept: video/webm,video/ogg,video/*;q=0.9,application/ogg;q=0.7,audio/*;q=0.6,*/*;q=0.5
Accept-Language: en-US,en;q=0.5
Referer: http://entervideo.net/watch/3accec760b23ad4
Range: bytes=0-
Connection: close

I converted it into a Python script:

import requests

session = requests.Session()

headers = {"Accept":"video/webm,video/ogg,video/*;q=0.9,application/ogg;q=0.7,audio/*;q=0.6,*/*;q=0.5","User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0","Referer":"http://entervideo.net/watch/3accec760b23ad4","Connection":"close","Accept-Language":"en-US,en;q=0.5","Range":"bytes=0-"}
response = session.get("http://85.217.223.24/vids/joe_goulds_secret_2000.mp4", headers=headers)

print("Status code:   %i" % response.status_code)
print("Response body: %s" % response.content)

When I run it, it hangs.
I can't tell whether it is actually downloading anything.

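My first question is: why can't the file be downloaded from the browser simply by visiting the URL?
Second, the script I'm using doesn't raise any error, yet it still hangs...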

2 answers:

Answer 0 (score: 1)

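Calling session.get this way is not recommended for downloading large files, because without stream=True requests reads the whole response body into memory before returning. That usage is mainly meant for network calls that return JSON or XML lists. To download a large file, you should follow the approach shown in this thread: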

Download large file in python with requests
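
As a rough illustration of the streaming approach that thread describes, here is a minimal sketch. The helper names save_stream and download, the 64 KiB chunk size and the 30-second timeout are assumptions made for this example, not something taken from the question or from the linked thread.

```python
def save_stream(response, path, chunk_size=64 * 1024):
    """Write a streamed response body to `path` and return the bytes written.

    `response` is anything with an iter_content(chunk_size=...) method,
    such as a requests.Response obtained with stream=True.
    """
    written = 0
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                written += len(chunk)
    return written


def download(url, path, headers=None, timeout=30):
    """Hypothetical wrapper: stream `url` to `path` without buffering it all."""
    import requests  # imported lazily so save_stream works without requests

    # stream=True postpones reading the body until iter_content is consumed.
    # The timeout makes a stalled connection raise instead of waiting forever.
    with requests.get(url, headers=headers, stream=True, timeout=timeout) as r:
        r.raise_for_status()  # 200 and 206 both pass; 4xx/5xx raise
        return save_stream(r, path)
```

Calling download(url, path, headers=headers) with the headers from the question should write the file in pieces instead of holding it in memory, assuming the server still accepts that Referer.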

Answer 1 (score: 0)

I managed to do it.
Note that the response status is 206, which means Partial Content.

Solution:

import requests

def download():
    # Same headers the browser sent. The Range header is what makes a
    # range-aware server reply with 206 Partial Content instead of 200.
    headers = {"Accept": "video/webm,video/ogg,video/*;q=0.9,application/ogg;q=0.7,audio/*;q=0.6,*/*;q=0.5",
               "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0",
               "Referer": "http://entervideo.net/watch/3accec760b23ad4", "Connection": "close",
               "Accept-Language": "en-US,en;q=0.5", "Range": "bytes=0-"}
    url = "http://85.217.223.24/vids/joe_goulds_secret_2000.mp4"
    # stream=True defers reading the body until iter_content is consumed
    get_response = requests.get(url, headers=headers, stream=True)
    #file_name  = url.split("/")[-1]
    file_name = r'c:\tmp\joe_goulds_secret_2000.mp4'
    with open(file_name, 'wb') as f:
        count = 0
        for chunk in get_response.iter_content(chunk_size=1024):
            print('chunk: ' + str(count))  # progress: one line per 1 KB chunk
            count += 1
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)

download()
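
The 206 status noted in this answer is the server's reply to the Range: bytes=0- header, and such a reply normally carries a Content-Range header saying which bytes were sent. As a rough sketch of how one might check whether a 206 response actually covers the whole file, the helpers below parse that header. The names parse_content_range and is_complete are hypothetical and are not part of requests.

```python
import re

# Matches values such as "bytes 0-999/1000" or "bytes 0-999/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value):
    """Return (start, end, total) from a Content-Range header value.

    total is None when the server reports the full length as "*".
    """
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    start, end = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return start, end, total


def is_complete(status_code, content_range=None):
    """True when the response is known to carry the entire file."""
    if status_code == 200:
        return True
    if status_code == 206 and content_range:
        start, end, total = parse_content_range(content_range)
        # Byte positions are inclusive, so a full file ends at total - 1.
        return total is not None and start == 0 and end == total - 1
    return False
```

With a requests response this could be used as is_complete(get_response.status_code, get_response.headers.get("Content-Range")).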