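I am using urllib2 in Python to fetch the contents of URLs and feed them to Python's native HTML parser. The code runs fine on my Python 2.7.4, but my friend's machine has Python 2.6.9, and there it fails with the following error: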
Traceback (most recent call last):
  File "opsview_audit.py", line 420, in <module>
    check_instances_against_regex(instances)
  File "opsview_audit.py", line 219, in check_instances_against_regex
    attrs_being_monitored = get_host_monitoring_status(cred['url'], running_instances,
    cred['user_name'], cred['pass_key'])
  File "opsview_audit.py", line 112, in get_host_monitoring_status
    parser.feed(result.read())
  File "/usr/lib64/python2.6/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib64/python2.6/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib64/python2.6/HTMLParser.py", line 229, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "/usr/lib64/python2.6/HTMLParser.py", line 304, in check_for_whole_start_tag
    self.error("malformed start tag")
  File "/usr/lib64/python2.6/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: malformed start tag, at line 509, column 47
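Presumably some start tag in the page is malformed, and Python 2.6.9 raises an exception for it while 2.7.4 does not. Upgrading from 2.6.9 to 2.7.4 or later is not an option here.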
Answer 0 (score: 0)
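Two solutions:
- Use another HTML parser, such as Beautiful Soup 3 or lxml. Both are easy to learn and can be used with Python 2.6.
- Try to find the malformed markup and filter it out before parsing.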
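
For the second option, a minimal sketch of how the offending markup could be located, using the position reported in the traceback (line 509, column 47). The helper name error_context and its width parameter are made up for illustration; it only assumes that the parser counts lines from 1, splits them on "\n", and counts columns from 0, which is how HTMLParser.getpos() reports positions. It uses nothing version-specific, so it should run on Python 2.6 as well.

```python
def error_context(html, lineno, column, width=40):
    """Return the text surrounding a parser-reported position.

    lineno is 1-based and column is 0-based, the same convention
    as the (line, column) pair shown in the HTMLParseError message.
    This is a hypothetical helper, not part of HTMLParser.
    """
    # Split on "\n" only, so line numbers match the parser's own count.
    lines = html.split("\n")
    if not 1 <= lineno <= len(lines):
        raise ValueError("line %d is outside the document" % lineno)
    line = lines[lineno - 1]
    # Show up to `width` characters on each side of the reported column.
    start = max(0, column - width)
    return line[start:column + width]
```

To use it, keep the downloaded page in a variable instead of passing result.read() straight to parser.feed(), then print error_context(page, 509, 47) to see the tag the parser rejects. Once the problem markup is known, it can be removed or repaired with an ordinary string replacement before the page is fed to the parser.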