I have a problem with mechanize. As of today, mechanize refuses to open this site:
url = "http://dom.gratka.pl/"
This is strange, because it worked fine until now.
Here is my code:
import mechanize
from bs4 import BeautifulSoup
br = mechanize.Browser()
br.set_handle_robots(False)
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=10)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
url = "http://dom.gratka.pl/"
parsehtml = br.open(url)
I get the following error, and I don't know what I'm doing wrong:
Traceback (most recent call last):
    parsehtml = br.open(url)
  File "C:\Python27\lib\site-packages\mechanize\_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "C:\Python27\lib\site-packages\mechanize\_mechanize.py", line 230, in _mech_open
    response = UserAgentBase.open(self, request, data)
  File "C:\Python27\lib\site-packages\mechanize\_opener.py", line 204, in open
    response = meth(req, response)
  File "C:\Python27\lib\site-packages\mechanize\_http.py", line 201, in http_response
    self.head_parser_class())
  File "C:\Python27\lib\site-packages\mechanize\_http.py", line 171, in parse_head
    parser.feed(data)
  File "C:\Python27\lib\site-packages\mechanize\_sgmllib_copy.py", line 110, in feed
    self.goahead(0)
  File "C:\Python27\lib\site-packages\mechanize\_sgmllib_copy.py", line 192, in goahead
    self.handle_charref(name)
  File "C:\Python27\lib\site-packages\mechanize\_http.py", line 80, in handle_charref
    self.handle_data(unescape_charref(name, self._encoding))
  File "C:\Python27\lib\site-packages\mechanize\_html.py", line 317, in unescape_charref
    uc = unichr(int(name, base))
ValueError: unichr() arg not in range(0x10000) (narrow Python build)
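
Reading the last frame, my guess is that the page's head now contains a numeric character reference above U+FFFF (an emoji, for example), which unichr() cannot produce on a narrow Python 2 build. Below is a minimal sketch of how such a reference could be decoded without the ValueError. The names surrogate_pair and charref_to_text are hypothetical helpers written for illustration; they are not part of mechanize, and the argument format (the reference body without "&#" and ";") is an assumption based on the traceback.

```python
import sys

try:
    unichr  # Python 2 built-in
except NameError:
    unichr = chr  # Python 3: chr() already covers the full Unicode range


def surrogate_pair(code_point):
    """Split a code point above U+FFFF into its two UTF-16 surrogate values."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def charref_to_text(name):
    """Decode the body of a numeric character reference, e.g. '65' or 'x41'.

    Hypothetical helper, not a mechanize function.
    """
    if name[:1] in ("x", "X"):
        code_point = int(name[1:], 16)
    else:
        code_point = int(name, 10)
    if code_point > 0xFFFF and sys.maxunicode == 0xFFFF:
        # Narrow build: unichr() would raise ValueError here,
        # so build the character from a surrogate pair instead.
        high, low = surrogate_pair(code_point)
        return unichr(high) + unichr(low)
    return unichr(code_point)
```

On a wide build or on Python 3 the surrogate branch is never taken, so the function simply falls through to unichr(). This only illustrates the failing conversion; it does not by itself change how mechanize parses the page.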