Question

I'm trying to download a ZIP file from this website. I looked at other questions like this and tried both requests and urllib, but I get the same error:

urllib.error.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop. The last 30x error message was: Found

Any ideas on how to open the file directly from the web?

Here is some sample code:

from urllib.request import urlopen
response = urlopen('http://www1.caixa.gov.br/loterias/_arquivos/loterias/D_megase.zip')

Answer 1

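The linked URL redirects indefinitely, which is why you get the 302 error.

You can check it yourself over here. As you can see, the linked URL immediately redirects to itself, creating a single-URL loop.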
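
To see such a loop from Python rather than from an online checker, one option is to follow the redirects by hand and stop as soon as a URL repeats. The following is only a sketch using the standard library; the names `fetch_once` and `trace_redirects` and the `max_hops` limit are hypothetical helpers introduced for illustration, not part of urllib or requests.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_once(url, timeout=10):
    """Send one GET without following redirects.

    Returns (status code, value of the Location header or None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("GET", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_once, max_hops=10):
    """Follow redirects manually; return (chain of URLs, loop detected?)."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return chain, False          # reached a non-redirect response
        url = urljoin(url, location)     # Location may be relative
        looped = url in chain            # seen before means a redirect loop
        chain.append(url)
        if looped:
            return chain, True
    return chain, False                  # gave up after max_hops
```

Calling `trace_redirects(url)` with the URL from the question would return the chain of visited URLs plus a flag; a URL that redirects to itself should show up as a two-element chain with the flag set to `True`. The `fetch` parameter exists only so the loop detection can be tried out without network access.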

Answer 2

Use the requests library.

This works for me:

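import io
import zipfile

import requests

url = 'http://www1.caixa.gov.br/loterias/_arquivos/loterias/D_megase.zip'

response = requests.get(url)
response.raise_for_status()  # fail early on an HTTP error status

# Unzip it into a local directory if you want
# (named "archive" to avoid shadowing the built-in zip)
with zipfile.ZipFile(io.BytesIO(response.content)) as archive:
    archive.extractall("/path/to/your/directory")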
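
If the server answers with an HTML page instead of the archive (for example an error or redirect page), `zipfile.ZipFile` will fail on the payload. A small guard can make that case easier to diagnose. This is a minimal sketch; `extract_zip_bytes` is a hypothetical helper name, and the in-memory archive below only stands in for `response.content`.

```python
import io
import tempfile
import zipfile


def extract_zip_bytes(data, dest_dir):
    """Extract ZIP bytes into dest_dir and return the member names.

    Raises ValueError when the payload is not a ZIP archive.
    """
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("payload is not a ZIP archive "
                         "(possibly an HTML error or redirect page)")
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Demo with an archive built in memory instead of a real download.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("results.txt", "demo")

with tempfile.TemporaryDirectory() as tmp:
    names = extract_zip_bytes(demo.getvalue(), tmp)
    print(names)
```

With a real download, `extract_zip_bytes(response.content, "/path/to/your/directory")` would take the place of the demo call.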

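Note that accessing a web page programmatically sometimes results in a 302 response because the site only wants you to reach the page through a web browser. If you need to pretend to be one (don't abuse this), just set the "User-Agent" header to look like a browser's. Here is an example that makes the request look as if it came from Chrome.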

user_agent = 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36'
headers = {'User-Agent': user_agent}
response = requests.get(url, headers=headers)
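
Since the question started from urllib, the same idea can be sketched with the standard library alone. The helper names `build_request` and `download` are hypothetical, and whether a browser-like header is enough for this particular server is an assumption; if the URL keeps redirecting to itself, `urlopen` will still raise `HTTPError`.

```python
from urllib.request import Request, urlopen

# Same Chrome-like string as in the example above.
BROWSER_UA = ('Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36')


def build_request(url, user_agent=BROWSER_UA):
    """Create a urllib Request that carries a browser-like User-Agent."""
    return Request(url, headers={'User-Agent': user_agent})


def download(url, timeout=30):
    """Return the response body as bytes (performs a real network call)."""
    with urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

`download(url)` would return the raw bytes, which can then be passed to `zipfile.ZipFile(io.BytesIO(...))` as in the requests version.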

There are several libraries (for example https://pypi.org/project/fake-useragent/) that can help with larger scraping projects.

Downloading a ZIP file from the web (Python)
