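I use dryscrape to scrape HTML data from several pages. It is all part of a Django application, but I found that the problem also occurs in a plain Python shell. The problem appears on the second connection. I am using: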
Python 2.7.6 (default, Mar 4 2014, 13:14:52)
dryscrape Version: 0.9
webkit-server Version: 1.0
xvfbwrapper Version: 0.2.5
Below you can see how I want to use it:
Python 2.7.6 (default, Mar 4 2014, 13:14:52)
Type "copyright", "credits" or "license" for more information.
IPython 2.1.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: import dryscrape
In [2]: from xvfbwrapper import Xvfb
In [3]: x = Xvfb()
In [4]: x.start()
In [5]: session = dryscrape.Session(base_url='http://google.com')
In [6]: session.visit('')
In [7]: session.url()
Out[7]: u'http://www.google.pl/?gfe_rd=cr&ei=d95qVvLfFc2v8wfamoG4Aw'
In [8]: x.stop()
So far everything works fine. But if I try to continue with another session:
...
In [8]: x.stop()
In [9]: x = Xvfb()
In [10]: x.start()
In [11]: session = dryscrape.Session(base_url='http://google.com')
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-11-6cbe39a8459d> in <module>()
----> 1 session = dryscrape.Session(base_url='http://google.com')
/home/mefioo/public_html/kariera_naukowa/env/lib/python2.7/site-packages/dryscrape/session.pyc in __init__(self, driver, base_url)
16 driver = None,
17 base_url = None):
---> 18 self.driver = driver or DefaultDriver()
19 self.base_url = base_url
20
/home/mefioo/public_html/kariera_naukowa/env/lib/python2.7/site-packages/dryscrape/driver/webkit.pyc in __init__(self, **kw)
28 def __init__(self, **kw):
29 kw.setdefault('node_factory_class', NodeFactory)
---> 30 super(Driver, self).__init__(**kw)
/home/mefioo/public_html/kariera_naukowa/env/lib/python2.7/site-packages/webkit_server.pyc in __init__(self, connection, node_factory_class)
228 node_factory_class = NodeFactory):
229 super(Client, self).__init__()
--> 230 self.conn = connection or ServerConnection()
231 self._node_factory = node_factory_class(self)
232
/home/mefioo/public_html/kariera_naukowa/env/lib/python2.7/site-packages/webkit_server.pyc in __init__(self, server)
505 def __init__(self, server = None):
506 super(ServerConnection, self).__init__()
--> 507 self._sock = (server or get_default_server()).connect()
508 self.buf = SocketBuffer(self._sock)
509 self.issue_command("IgnoreSslErrors")
/home/mefioo/public_html/kariera_naukowa/env/lib/python2.7/site-packages/webkit_server.pyc in connect(self)
438 """ Returns a new socket connection to this server. """
439 sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
--> 440 sock.connect(("127.0.0.1", self._port))
441 return sock
442
/usr/local/lib/python2.7/socket.pyc in meth(name, self, *args)
222
223 def meth(name,self,*args):
--> 224 return getattr(self._sock,name)(*args)
225
226 for _m in _socketmethods:
error: [Errno 111] Connection refused
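I did this only as an example; in my Django application it is part of the view logic, and requesting that view a second time causes this error. Restarting the Django server or the Python shell fixes it, but only for the first connection, so it is useless for a working web page. Am I missing some "cleanup" or "restart" of the X session or webkit-server (capybara-webkit) between these two?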
Answer 0 (score: 0)
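Well, this is not a "real" answer, since I still don't know what went wrong, but I found a way to make this work. I upgraded dryscrape to 1.0 and used the new method dryscrape.start_xvfb() instead of xvfbwrapper's Xvfb(). Everything works fine.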
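For illustration, here is a minimal sketch of how that could look in a long-running process such as a Django view, assuming dryscrape 1.0 or later is installed. The helpers ensure_xvfb and final_url are hypothetical names, not part of dryscrape, and the idea of starting the virtual display once per process and leaving it running (instead of starting and stopping it on every request) is my assumption about how start_xvfb() is meant to be used, not something I have verified in the dryscrape source.

```python
import threading

_xvfb_lock = threading.Lock()
_xvfb_started = False


def ensure_xvfb(start=None):
    """Start the virtual display at most once per process.

    `start` is a hypothetical hook for substituting another starter
    function; by default dryscrape.start_xvfb (dryscrape 1.0) is used.
    Returns True if this call started the display, False if it was
    already started by an earlier call.
    """
    global _xvfb_started
    with _xvfb_lock:
        if _xvfb_started:
            return False
        if start is None:
            import dryscrape  # imported lazily; assumes dryscrape >= 1.0
            start = dryscrape.start_xvfb
        start()
        _xvfb_started = True
        return True


def final_url(base_url):
    """Visit base_url in a fresh session and return the URL after redirects."""
    import dryscrape
    ensure_xvfb()  # no x.start() / x.stop() pair around each request
    session = dryscrape.Session(base_url=base_url)
    session.visit('')
    return session.url()
```

A view would then call final_url('http://google.com') on every request; only the first call starts the display, and later calls just create a new session.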