Python 3: download a file from a URL behind a button click

Time: 2018-01-13 10:46:39

Tags: python parsing web-scraping download urllib

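I need to download a file from a link like this:

https://freemidi.org/getter-13560

However, I can't get it with urllib.request or the requests library, because they download the HTML page instead of the MIDI file. Is there a solution? Here is also the link to the button itself: link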

1 Answer:

Answer 0 (score: 1)

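By sending the right headers and using a session, we can download and save the file with the requests module. The session keeps the cookies set by the first request, so the second request to the same URL returns the MIDI file instead of the HTML page.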

import requests

headers = {
            "Host": "freemidi.org",
            "Connection": "keep-alive",
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
            # "br" is left out: requests only decodes Brotli when the brotli package is installed
            "Accept-Encoding": "gzip, deflate",
            "Accept-Language": "en-US,en;q=0.9",
           }

session = requests.Session()

# The first request only lets the website set its cookies on the session
req1 = session.get("https://freemidi.org/getter-13560", headers=headers)

# Request the same URL again, now with the cookies, to get the file itself
req2 = session.get("https://freemidi.org/getter-13560", headers=headers)
print(len(req2.content))  # Size of the downloaded MIDI file in bytes

with open("testFile.mid", "wb") as saveMidi:
    saveMidi.write(req2.content)
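
Since the original problem was getting HTML back instead of MIDI, it may be worth checking the payload before saving it. The sketch below is not part of the answer's method: it repeats the request until the body starts with the bytes "MThd", which is how a Standard MIDI File begins. The names looks_like_midi and fetch_midi are hypothetical helpers, and the assumption that two attempts are enough comes only from the behaviour described above.

```python
MIDI_MAGIC = b"MThd"  # a Standard MIDI File starts with this chunk ID


def looks_like_midi(data: bytes) -> bool:
    """Return True if the payload starts with the Standard MIDI File header."""
    return data.startswith(MIDI_MAGIC)


def fetch_midi(session, url, headers=None, attempts=2):
    """Request `url` up to `attempts` times and return the first MIDI payload.

    `session` is assumed to behave like requests.Session: it offers
    get(url, headers=...) and the result has a bytes attribute `.content`.
    """
    for _ in range(attempts):
        response = session.get(url, headers=headers)
        if looks_like_midi(response.content):
            return response.content
    raise ValueError(f"no MIDI data received from {url} after {attempts} attempts")


# Hypothetical usage with the headers dict from the answer:
#   import requests
#   data = fetch_midi(requests.Session(), "https://freemidi.org/getter-13560", headers)
#   with open("testFile.mid", "wb") as save_midi:
#       save_midi.write(data)
```

If the site still answers with HTML after the retries, the function raises instead of silently writing a web page to a .mid file.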