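I want to download a file from a web page. The page contains a single .zip file (which is what I want to download), but when I click it, the download starts without the URL changing (the URL stays in the form http://ldn2800:8080/id=2800). How should I handle this, given that there is no URL of the form http://example.com/1.zip?
Also, when I go directly to the page http://ldn2800:8080/id=2800, it only opens the page containing the .zip file and does not download anything unless I click. How can I download it using Python?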
Update: this is what I am doing now:
if (str(dict.get('id')) == winID):
    #or str(dict.get('id')) == linuxID):
    #if str(dict.get('number')) == buildNo:
    buildTypeId = dict.get('id')
    ID = dict.get('id')
    downloadURL = "http://example:8080/viewType.html?buildId=26009&tab=artifacts&buildTypeId=" + ID
    directory = BindingsDest + "\\" + buildNo
    if not os.path.exists(directory):
        os.makedirs(directory)
    fileName = None
    if buildTypeId == linuxID:
        fileName = linuxLib + "-" + buildNo + ".zip"
    elif buildTypeId == winID:
        fileName = winLib + "-" + buildNo + ".zip"
    if fileName is not None:
        print(dict)
        downloadFile(downloadURL, directory, fileName)
def downloadFile(downloadURL, directory, fileName, user=user, password=password):
    if user is not None and password is not None:
        request = requests.get(downloadURL, stream=True, auth=(user, password))
    else:
        request = requests.get(downloadURL, stream=True)
    with open(directory + "\\" + fileName, 'wb') as handle:
        for block in request.iter_content(1024):
            if not block:
                break
            handle.write(block)
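However, this only creates a zip file at the desired location, and that zip file cannot be opened and contains nothing. Is it possible to do something like this instead: search the web page for the file name and then download whatever matches the pattern?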
Answer 0 (score: 1)
Check the HTTP status code to make sure no error occurred. You can use the built-in method raise_for_status to do this: https://requests.readthedocs.io/en/master/api/#requests.Response.raise_for_status
def downloadFile(downloadURL, directory, fileName, user=user, password=password):
    if user is not None and password is not None:
        request = requests.get(downloadURL, stream=True, auth=(user, password))
    else:
        request = requests.get(downloadURL, stream=True)
    # Raise requests.HTTPError on a 4xx/5xx response instead of saving an error page
    request.raise_for_status()
    with open(directory + "\\" + fileName, 'wb') as handle:
        for block in request.iter_content(1024):
            if not block:
                break
            handle.write(block)
Are you sure there are no network problems, such as a proxy, a firewall, etc.?
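A successful status code does not prove that the body is a zip archive. If the request hits the HTML page rather than the file, the saved ".zip" will simply be HTML under a different name, which could explain why it cannot be opened. The sketch below is one way to check what was actually received. The helper names `looks_like_zip` and `describe_payload` are hypothetical, and the HTML detection is a rough heuristic, not a complete content sniffer.

```python
import io
import zipfile


def looks_like_zip(data: bytes) -> bool:
    """Return True if the bytes have the structure of a zip archive."""
    return zipfile.is_zipfile(io.BytesIO(data))


def describe_payload(data: bytes, content_type: str = "") -> str:
    """Roughly classify a downloaded payload as 'zip', 'html' or 'unknown'.

    content_type is the value of the Content-Type response header, if known.
    """
    if looks_like_zip(data):
        return "zip"
    # Heuristic only: look at the header and at the first bytes of the body.
    head = data[:512].lstrip().lower()
    if "html" in content_type.lower() or head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"
```

With requests, this could be called as `describe_payload(response.content, response.headers.get("Content-Type", ""))` before writing the file to disk. A result of "html" would suggest that the URL points at the page and not at the archive.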
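Edit: based on your comment above, I'm not sure this solves your actual problem. Revised answer: you visit a web page that contains a link to the zip file. You say this link is the same as the URL of the page itself, yet clicking it in a browser downloads the file instead of loading the HTML page again. This is odd, but it can be explained in various ways. Please copy/paste the entire HTML source of the page (including the link to the zip file); this may help us understand the problem.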
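Regarding the idea of searching the page for the file name: if the page source turns out to contain an ordinary `<a href="...">` link to the archive, something along these lines could extract it. This is only a sketch using the standard library. The class `ZipLinkParser` and the function `find_zip_links` are hypothetical names, and it assumes the link is present in the static HTML; a download triggered by JavaScript or a form submission would not be found this way.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ZipLinkParser(HTMLParser):
    """Collect absolute URLs of <a> tags whose path ends in .zip."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the URL of the page itself.
        absolute = urljoin(self.base_url, href)
        # Compare only the path, so query strings do not hide the extension.
        if urlparse(absolute).path.lower().endswith(".zip"):
            self.links.append(absolute)


def find_zip_links(html, base_url):
    """Return every .zip link found in the given HTML text."""
    parser = ZipLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

The HTML text would come from fetching the page first (for example `requests.get(pageURL, auth=(user, password)).text`, where `pageURL` is a placeholder), and each returned URL could then be passed to `downloadFile` in place of the page URL.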