Question

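I have a script that looks for information in web text pages and then stores it in a dictionary. The script takes URLs from a list and processes them in a loop, but partway through it is interrupted by this error: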

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 406, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 300: Multiple Choices

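I have trouble interpreting this error, and I don't know whether there is a way to avoid this kind of problem. Is there a way to add exception handling to the script?

Here is my script: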

import urllib2
import sys
import re

IDlist = ['C9JVZ1', 'C9JLN0', 'C9J872']  # (there are more than 1500 of them)

URLlist = ["http://www.uniprot.org/uniprot/"+x+".txt" for x in IDlist]

function_list = {}
for id, item in zip(IDlist, URLlist):
    function_list[id] = []
    textfile = urllib2.urlopen(item)
    myfile = textfile.readlines()
    for line in myfile:
        print "line:", line
        found = re.search(r'\s[C]:(.+?);', line)
        if found:
            function = found.group(1)
            function_list[id].append(function)

Answer 1

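The web server is returning the HTTP status code 300 Multiple Choices (see Wikipedia) for one of the URLs you are trying to access. This probably means that one of the URLs in your list is wrong, and the web server is trying to help you by offering a list of similar URLs that do exist.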

In general, urllib2 turns any response that is neither a success nor a simple redirect into an exception, and that is what you are seeing here.

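When an exception is not handled somewhere, for example with a try-except block, it usually terminates your program. So you need to wrap your call to urlopen in a try block: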

try:
    textfile = urllib2.urlopen(item)
except urllib2.HTTPError:
    # Do something here to handle the error. For example:
    # (Python 2 print statement, matching the script in the question)
    print "URL", item, "could not be read."
    continue
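
To show where the try block sits inside the original loop, here is a minimal sketch of the whole thing. It is not the only way to do it, and several parts are assumptions: the helper name `collect_functions`, its `fetch` parameter, and the `failed` dictionary are made up for illustration, and the import fallback is only there so the same code can also run on Python 3, where `urlopen` and `HTTPError` live in `urllib.request` and `urllib.error`. IDs whose URL raises an HTTPError are skipped and recorded with their status code instead of stopping the run.

```python
import re

try:
    # Python 2, as used in the question
    from urllib2 import urlopen, HTTPError
except ImportError:
    # Python 3 equivalents (assumption: the question itself targets Python 2)
    from urllib.request import urlopen
    from urllib.error import HTTPError

BASE_URL = "http://www.uniprot.org/uniprot/"
PATTERN = re.compile(r"\s[C]:(.+?);")  # same regex as in the question


def collect_functions(ids, fetch=urlopen):
    """Hypothetical helper: return (function_list, failed).

    `fetch` defaults to urlopen; it is a parameter only so that the
    loop can be tried out without network access.
    """
    function_list = {}
    failed = {}
    for uid in ids:
        url = BASE_URL + uid + ".txt"
        try:
            response = fetch(url)
        except HTTPError as err:
            # Remember the status code (e.g. 300) and go on to the next ID.
            failed[uid] = err.code
            continue
        function_list[uid] = []
        for line in response.readlines():
            # Responses yield bytes; decode before applying the text regex.
            if isinstance(line, bytes):
                line = line.decode("utf-8")
            found = PATTERN.search(line)
            if found:
                function_list[uid].append(found.group(1))
    return function_list, failed
```

Calling `functions, failed = collect_functions(IDlist)` would then leave the problematic IDs in `failed`, so they can be checked by hand afterwards.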
