Bottle shutdown

Asked: 2012-03-14 11:58:56

Tags: python scrapy bottle

I'm using Python 2.7.2, Bottle 0.10.9 and the "Swiss army knife" Scrapy 0.14.1 to write a simple REST API.

In short, there is only one method (myserver:8081/doparse?address="url"), which starts a Scrapy crawl of the given URL and returns the response as JSON.

When I deploy the script with Bottle's built-in server, I get the following output:

Shutdown...
Traceback (most recent call last):
  File "parser/main.py", line 67, in <module>
    run(host='ks205512.kimsufi.com', port=8081)
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 2391, in run
    server.run(app)
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 2089, in run
    srv.serve_forever()
  File "/usr/lib/python2.6/SocketServer.py", line 224, in serve_forever
    r, w, e = select.select([self], [], [], poll_interval)
select.error: (4, 'Interrupted system call')

Using Bottle with other servers such as CherryPy produces different errors, for example:

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 737, in _handle
    return route.call(**args)
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 1456, in wrapper
    rv = callback(*a, **ka)
  File "parser/main.py", line 20, in parse
    return parse_url(url)
  File "parser/main.py", line 35, in parse_url
    items = crawler.start(url) # launching crawler
  File "/home/projects/linkedinparser/parser/crawler.py", line 140, in start
    crawler = CrawlerWorker(LinkedinSpider(url), results)
  File "/home/projects/linkedinparser/parser/crawler.py", line 85, in __init__
    self.crawler = CrawlerProcess(settings)
  File "/usr/local/lib/python2.6/dist-packages/scrapy/crawler.py", line 69, in __init__
    install_shutdown_handlers(self._signal_shutdown)
  File "/usr/local/lib/python2.6/dist-packages/scrapy/utils/ossignal.py", line 21, in install_shutdown_handlers
    reactor._handleSignals()
  File "/usr/local/lib/python2.6/dist-packages/twisted/internet/posixbase.py", line 292, in _handleSignals
    _SignalReactorMixin._handleSignals(self)
  File "/usr/local/lib/python2.6/dist-packages/twisted/internet/base.py", line 1129, in _handleSignals
    signal.signal(signal.SIGINT, self.sigInt)
ValueError: signal only works in main thread

I would appreciate any help. Thanks.

1 answer:

Answer 0 (score: 2)

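By default, the default reactor installs signal handlers to catch events such as Ctrl-C and SIGTERM. However, Python does not let you install signal handlers from a non-main thread, which means reactor.run() will raise an error when it is called there. Pass the installSignalHandlers=0 keyword argument to reactor.run to work around this.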
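
To see the underlying restriction in isolation, here is a minimal sketch that uses only the standard library (no Twisted, Scrapy or Bottle). The helper name try_install_handler and the two result lists are made up for this illustration. It tries to register a SIGINT handler first from the main thread and then from a worker thread, which is roughly the situation a request handler is in when the web server runs it off the main thread.

```python
import signal
import threading


def try_install_handler(results):
    """Try to register a SIGINT handler and record what happened."""
    try:
        # default_int_handler is already CPython's default SIGINT handler,
        # so re-registering it does not change the program's behaviour.
        signal.signal(signal.SIGINT, signal.default_int_handler)
        results.append("installed")
    except ValueError as exc:
        # CPython only allows signal.signal() in the main thread.
        results.append("ValueError: %s" % exc)


# Called directly: this runs in whichever thread executes the script,
# normally the main thread, where registration is allowed.
main_results = []
try_install_handler(main_results)

# Called from a worker thread: registration is refused.
worker_results = []
worker = threading.Thread(target=try_install_handler, args=(worker_results,))
worker.start()
worker.join()

print(main_results[0])
print(worker_results[0])
```

The worker thread reports the same ValueError that ends the second traceback in the question. With Twisted, reactor.run(installSignalHandlers=0) skips that registration when the reactor is started. Note, though, that in the traceback above the failing call comes from Scrapy's CrawlerProcess constructor (install_shutdown_handlers), so that code path may need to be avoided or moved to the main thread as well.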