How do I restart the ioloop in Tornado when fetching the Twitter stream API?

Asked: 2012-10-05 05:20:32

Tags: python twitter tornado

I'm using TweetStream (https://github.com/joshmarshall/TweetStream), a Tornado-based Twitter streaming module, to monitor the streaming API.

I would like to know how to restart the fetch when I want to change the words being tracked.

My current solution (not really a solution) gives me some errors.

stream = tweetstream.TweetStream(configuration,ioloop=main_io_loop)

stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)


def check_words():
    global words
    with open('words.txt') as file:
        newwords = file.read()
        if words != newwords:
            words = newwords
        try:
            print newwords
            stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)
        except:
            pass
        file.close()

interval_ms = 1000*10
scheduler = tornado.ioloop.PeriodicCallback(check_words,interval_ms,io_loop = main_io_loop)
scheduler.start()
main_io_loop.start()

This is the error I get:

ERROR:root:Uncaught exception, closing connection.
Traceback (most recent call last):
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 305, in wrapper
    callback(*args)
  File "/home/user/PycharmProjects/observrenv/src/tweetstream/tweetstream.py", line 155, in on_connect
    self._twitter_stream.read_until("\r\n\r\n", self.on_headers)
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 151, in read_until
    self._set_read_callback(callback)
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 369, in _set_read_callback
    assert not self._read_callback, "Already reading"
AssertionError: Already reading
ERROR:root:Exception in callback <tornado.stack_context._StackContextWrapper object at 0x2415cb0>
Traceback (most recent call last):
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/ioloop.py", line 421, in _run_callback
    callback()
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 305, in wrapper
    callback(*args)
  File "/home/user/PycharmProjects/observrenv/src/tweetstream/tweetstream.py", line 155, in on_connect
    self._twitter_stream.read_until("\r\n\r\n", self.on_headers)
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 151, in read_until
    self._set_read_callback(callback)
  File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 369, in _set_read_callback
    assert not self._read_callback, "Already reading"
AssertionError: Already reading

I got better results (though not the best) by creating a new stream and starting the ioloop again inside check_words.

stream = tweetstream.TweetStream(configuration,ioloop=main_io_loop)

stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)


def check_words():
    global words, stream
    with open('words.txt') as file:
        newwords = file.read()
    if words != newwords:
        words = newwords
        print newwords
        try:
            stream = tweetstream.TweetStream(configuration,ioloop=main_io_loop)
            stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)
            interval_ms = 1000*10
            scheduler = tornado.ioloop.PeriodicCallback(check_words,interval_ms,io_loop = main_io_loop)
            scheduler.start()
            main_io_loop.start()
        except:
            pass
        file.close()


interval_ms = 1000*10
scheduler = tornado.ioloop.PeriodicCallback(check_words,interval_ms,io_loop = main_io_loop)
scheduler.start()
main_io_loop.start()

2 answers:

Answer 0 (score: 1)

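As a Twitter employee said here, what I am doing is the recommended approach (though in a more moderate way). If your query terms change, just reconnect once. Otherwise, keep the connection open. It is also important to monitor the errors Twitter may send you, or you may get banned.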
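The "reconnect once, only when the terms change" idea can be sketched roughly as follows. This is a minimal illustration, not TweetStream's API: `TrackReconnector`, `normalize`, `connect` and `disconnect` are hypothetical names, and the code assumes the tracked terms are a comma-separated string. Normalizing the terms first avoids a needless reconnect when only whitespace or ordering in words.txt changes.

```python
def normalize(raw_terms):
    # Assumption: terms are comma-separated. Strip whitespace, drop empty
    # entries and duplicates, and sort so that equivalent inputs compare equal.
    parts = [term.strip() for term in raw_terms.split(",")]
    return ",".join(sorted(set(p for p in parts if p)))


class TrackReconnector(object):
    """Reconnects only when the tracked terms actually change."""

    def __init__(self, connect, disconnect):
        # connect(terms) returns a connection handle; disconnect(handle)
        # closes it. Both are hypothetical callables supplied by the caller.
        self._connect = connect
        self._disconnect = disconnect
        self._terms = None
        self._conn = None

    def update(self, raw_terms):
        """Return True if a (re)connect happened, False otherwise."""
        terms = normalize(raw_terms)
        if terms == self._terms:
            return False  # nothing changed: keep the open connection
        if self._conn is not None:
            self._disconnect(self._conn)  # close the old stream first
        self._conn = self._connect(terms)
        self._terms = terms
        return True
```

In the setup from the question, `update` could be called from the PeriodicCallback with the contents of words.txt, and `connect` could create a new TweetStream and call `fetch`. How an existing TweetStream is closed is not shown in this thread, so `disconnect` is left as a placeholder.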

Answer 1 (score: 0)

It looks like you are missing the main idea of the Streaming API: the connection to it stays open permanently.

stream = tweetstream.TweetStream(configuration,ioloop=main_io_loop)

# What are you doing in the callback?
stream.fetch("/1.1/statuses/filter.json?track="+tornado.escape.url_escape(words), callback=callback)


def check_words():
    #I guess, don't do it at all. 
    #global words
    #with open('words.txt') as file:
    #    newwords = file.read()
    #    if words != newwords:
    #        words = newwords
    #    try:
    #        #Don't open new stream here
    #        print newwords
    #    except:
    #        pass
    #    file.close()
    pass

interval_ms = 1000*10
scheduler = tornado.ioloop.PeriodicCallback(check_words,interval_ms,io_loop = main_io_loop)
scheduler.start()
main_io_loop.start()

From analyzing your code, I think you have to handle the new words inside the callback.
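One possible reading of this suggestion is to keep the single connection open and filter incoming messages against the current word list inside the callback. The sketch below assumes each message arrives as a dict with a "text" key; `make_callback`, `get_words` and `handle` are hypothetical names, not part of TweetStream.

```python
import re


def make_callback(get_words, handle):
    """Build a stream callback that filters messages by the current words.

    get_words() returns the current set of lowercase words; handle(message)
    is called only for messages whose text contains one of them.
    """
    def callback(message):
        # Assumption: the message is a dict carrying the tweet text
        # under the "text" key.
        text = (message.get("text") or "").lower()
        tokens = set(re.findall(r"\w+", text))
        if tokens & set(get_words()):
            handle(message)
    return callback
```

Note that this only narrows what the open stream already delivers. A word that was not part of the original track parameter will never arrive, so adding new terms still requires a reconnect.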