I'm trying to scrape some data from http://m.finnkino.fi/events/now_showing, but I'm stuck because I can't even load the page source with Python. At the moment I'm using the following code:
req = urllib2.urlopen(URL,None,2.5)
page = req.read()
print page
Here is the traceback for the timeout error:
Traceback (most recent call last):
  File "user/src/finnkinoParser.py", line 26, in <module>
    main()
  File "user/src/finnkinoParser.py", line 13, in main
    getNowPlayingMovies()
  File "user/src/finnkinoParser.py", line 17, in getNowPlayingMovies
    req = urllib2.urlopen(baseURL,None,2.5)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 383, in open
    response = self._open(req, data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 401, in _open
    '_open', req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 1130, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 1105, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error timed out>
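The URL loads fine in a browser. Can anyone tell me what is different about this site that keeps urllib2 from loading the page? I suspect it has to do with the site being aimed at mobile users; urllib2 works fine with "regular" sites. Are there other kinds of sites where a basic urlopen(URL) doesn't work?
Thanks for your help.
Answer 0 (score: 3)
The following snippet works fine.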
import httplib
headers = {"User-Agent": "Mozilla/5.0"}
conn = httplib.HTTPConnection("m.finnkino.fi")
conn.request("GET", "/events/now_showing", "", headers)
response = conn.getresponse()
print response.status, response.reason
data = response.read()
print data
conn.close()
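It seems their server validates several request variables. After some testing, my conclusion is:
In urllib2, HTTPHandler sets the Connection header to close by default (line 1127 of urllib2.py). You can use urlgrabber or another HTTP handler that supports HTTP/1.1 and keep-alive.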
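For reuse, the working snippet above could be wrapped roughly like this. It is only a sketch: the names build_headers and fetch and the 10-second timeout are my own choices for illustration, not anything the site requires, and the try/except import is there just so the same code runs on both Python 2 (httplib) and Python 3 (http.client). I have not confirmed which of the request details the server actually checks, so treat the User-Agent value as an assumption carried over from the snippet.

```python
try:
    import http.client as httplib  # Python 3 name of the module
except ImportError:
    import httplib  # Python 2 name, as used in the snippet above

HOST = "m.finnkino.fi"
PATH = "/events/now_showing"


def build_headers(user_agent="Mozilla/5.0"):
    """Return request headers with a browser-like User-Agent (assumed value)."""
    return {"User-Agent": user_agent}


def fetch(host=HOST, path=PATH, timeout=10):
    """GET the page over a plain httplib connection.

    Unlike urllib2's HTTPHandler, this does not add "Connection: close".
    Returns a (status, reason, body) tuple.
    """
    conn = httplib.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", path, headers=build_headers())
        response = conn.getresponse()
        return response.status, response.reason, response.read()
    finally:
        # Always release the socket, even if the request fails.
        conn.close()
```

Calling fetch() needs network access and will raise the usual socket errors if the server does not answer within the timeout, so wrap the call in try/except if you want to retry.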