Open a URL using only Python requests, then proceed to download the file

Time: 2019-02-14 07:52:43

Tags: python python-requests

I am trying to download a file from a website using Python's requests module.

However, the site only lets me download the file when the download link is clicked directly from the download page. So, using requests, I tried to first access the download page URL with requests.get() and then proceed to download the file. Unfortunately, this does not seem to work: the text asking me to open the download page first is simply written to "file.torrent".

My code:

import requests

def download(username, password):
    with requests.Session() as session:
        session.post('https://website.net/forum/login.php', data={'login_username': username, 'login_password': password})

        # Download page URL
        requests.get('https://website.net/forum/viewtopic.php?t=2508126')

        # The download URL itself
        response = requests.get('https://website.net/forum/dl.php?t=2508126')

        with open('file.torrent', 'wb') as f:
            f.write(response.content)

download(username='XXXXX', password='YYYYY')

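Browser request and response when clicking the link on the download page (works; opening the download link on its own does not):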

General :
Request URL: https://website.net/forum/dl.php?t=2508126
Request Method: GET
Status Code: 200 OK
Remote Address: 185.37.128.136:443
Referrer Policy: no-referrer-when-downgrade

Response Headers :
Cache-Control: no-store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Content-Disposition: attachment; filename="[website.net].t2508126.torrent"
Content-Length: 33641
Content-Type: application/x-bittorrent; name="[website.net].t2508126.torrent"
Date: Thu, 14 Feb 2019 07:57:08 GMT
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Thu, 14 Feb 2019 07:57:09 GMT
Pragma: no-cache
Server: nginx
Set-Cookie: bb_dl=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/forum/; domain=.website.net

Request Headers :
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Cookie: bb_t=a%3A3%3A%7Bi%3A2507902%3Bi%3A1550052944%3Bi%3A2508011%3Bi%3A1550120230%3Bi%3A2508126%3Bi%3A1550125516%3B%7D; bb_data=1-27969311-wXVPJGcedLE1I2mM9H0u-3106784170-1550128652-1550131012-3061288864-1; bb_dl=2508126
Host: website.net
Referer: https://website.net/forum/viewtopic.php?t=2508126
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3701.0 Safari/537.36

Query String Parameters :
t: 2508126

1 answer:

Answer 0 (score: 0)

This worked for me:

data={'login_username': username, 'login_password': password, 'login': ''}

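and use session.get() instead of requests.get(). A bare requests.get() call does not go through the session, so the cookies set at login are not sent with it.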
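
Putting both changes together, a minimal sketch might look like the following. The function name download_torrent, the path parameter, and the Content-Type check are illustrative additions, not part of the original code. Sending a Referer header mirrors the browser request shown in the question; whether the site actually checks it is an assumption.

```python
BASE_URL = 'https://website.net/forum'


def download_torrent(session, username, password, topic_id, path='file.torrent'):
    """Log in, visit the download page, then fetch the file, all on one session."""
    # Log in; the session keeps whatever cookies the site sets here.
    login = session.post(
        f'{BASE_URL}/login.php',
        data={'login_username': username, 'login_password': password, 'login': ''},
    )
    login.raise_for_status()

    # Visit the download page through the same session so its cookies are kept.
    page_url = f'{BASE_URL}/viewtopic.php?t={topic_id}'
    session.get(page_url).raise_for_status()

    # Fetch the file itself. The Referer header copies the browser request;
    # that the site requires it is an assumption.
    response = session.get(
        f'{BASE_URL}/dl.php?t={topic_id}',
        headers={'Referer': page_url},
    )
    response.raise_for_status()

    # Avoid saving an HTML message under a .torrent name.
    content_type = response.headers.get('Content-Type', '')
    if 'bittorrent' not in content_type:
        raise RuntimeError(f'Expected a torrent, got Content-Type {content_type!r}')

    with open(path, 'wb') as f:
        f.write(response.content)
    return path


def main():
    # Third-party package; imported here so the helper above accepts any
    # object with the same post()/get() methods as requests.Session.
    import requests

    with requests.Session() as session:
        download_torrent(session, 'XXXXX', 'YYYYY', 2508126)
```

Calling main() performs the real login and download. If the site still returns the "open the download page first" text, the function raises instead of silently writing that text to file.torrent.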