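I'm working on a small Python script that uses wget to download some files from my university's website. The files I'm interested in are stored sequentially. The link to the first file is: http://vlibcm.mmu.edu.my//xzamp/gxzam.php?action=37158.pdf
and the links to the following files look like this: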
http://vlibcm.mmu.edu.my//xzamp/gxzam.php?action=37159.pdf
http://vlibcm.mmu.edu.my//xzamp/gxzam.php?action=37160.pdf
so the file number increases by 1 each time.
This is the Python script I wrote for this:
import os
import wget

BaseURL = "http://vlibcm.mmu.edu.my//xzamp/gxzam.php?action="
BaseNumber = 37158
DownloadLocation = os.path.dirname(os.path.realpath(__file__))  # save next to this script

for i in range(3):
    # The file number increases by 1 on each iteration
    FullURL = BaseURL + str(BaseNumber + i) + '.pdf'
    print("Working on:", FullURL)
    wget.download(FullURL, DownloadLocation)
When I run this script, I get the following error:
Downloading currently: http://vlibcm.mmu.edu.my//xzamp/gxzam.php?action=37158.pdf
Traceback (most recent call last):
File "main.py", line 12, in <module>
wget.download(FullURL, DownloadLocation)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\site-packages\wget.py", line 526, in download
(tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 755, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 755, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 755, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 755, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\MainUser\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 745, in http_error_302
self.inf_msg + msg, headers, fp)
urllib.error.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Found
I'd like to note two things:
Answer 0 (score: 0)
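The error comes from urllib and reports HTTP Error 302. A 302 ("Found") response is a redirect rather than a failure in itself; the exception is raised because the server keeps redirecting and urllib gives up once it detects a loop. You can read more about HTTP 302 here: https://en.wikipedia.org/wiki/HTTP_302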
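To see where the server is trying to send you, it can help to make a single request without following the redirect. The following is a minimal sketch using only the standard library; `NoRedirect` and `peek_redirect` are hypothetical names introduced here for illustration and are not part of wget or urllib.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to follow the redirect,
        # so the 30x response surfaces as an HTTPError instead.
        return None


def peek_redirect(url, timeout=10):
    """Make one request and return (status_code, Location header or None)."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        # For a 302, the Location header holds the redirect target.
        return err.code, err.headers.get("Location")
```

Calling `peek_redirect("http://vlibcm.mmu.edu.my//xzamp/gxzam.php?action=37158.pdf")` would return the status code and the redirect target, if any. If the target is a login page or the same URL again, that suggests the server expects something the script is not sending.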
You can debug this in a browser first to make sure the file is actually available on the server.
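If the file opens in the browser but the script still loops, one common cause (an assumption here, not something confirmed for this server) is that the site sets a cookie on the first response and redirects until the client sends it back. A browser keeps cookies; a plain urllib request does not. The sketch below uses the standard library instead of the wget package and keeps cookies across redirects. The helper names `make_cookie_opener`, `download` and `download_range` are hypothetical, and a site that requires a login would need more than this.

```python
import http.cookiejar
import os
import shutil
import urllib.request


def make_cookie_opener():
    """Build an opener that stores cookies and sends them back, like a browser."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def download(opener, url, dest_path, timeout=30):
    """Fetch url with the given opener and write the response body to dest_path."""
    with opener.open(url, timeout=timeout) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path


def download_range(opener, base_url, first, count, dest_dir):
    """Download <first>.pdf ... <first+count-1>.pdf and return the saved paths."""
    saved = []
    for number in range(first, first + count):
        name = f"{number}.pdf"
        print("Working on:", base_url + name)
        saved.append(download(opener, base_url + name, os.path.join(dest_dir, name)))
    return saved
```

With the values from the question, this could be called as `download_range(make_cookie_opener(), "http://vlibcm.mmu.edu.my//xzamp/gxzam.php?action=", 37158, 3, ".")`. If the loop persists even with cookies enabled, the redirect target is the next thing to check.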