Looping async API calls in Python

Date: 2018-04-21 07:28:58

Tags: python async-await python-asyncio

I am trying to store a list of objects using Tweepy's statuses_lookup API call. Each call to statuses_lookup takes a list of IDs, up to 100 at a time.

The function below takes a list of IDs, and I am trying to append all of the metadata returned by the API calls to a tweetData list:

def lookupTweets(self, tweetIds):
    tweetData = []
    i = 0
    while i < len(tweetIds):
        print(i)
        if len(tweetIds) - i > 100:
            statuses = self.status_lookup(tweetIds[i:i + 100])
        else:
            statuses = self.status_lookup(tweetIds[i:len(tweetIds)])

        tweetData.append(statuses)
        i += 100

    return tweetData

Here is the async function that makes the API call:

async def status_lookup(self, tweets):
    return self.api.statuses_lookup(tweets)

Here is the main method:

if __name__ == "__main__":
    twitterEngine = TwitterEngine()
    tweets = twitterEngine.ingestData("democratic-candidate-timelines.txt")
    twitterData = twitterEngine.lookupTweets(tweets)
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(twitterData))
    print(twitterData)

When I print the result of twitterData, I get a list of coroutine objects. However, I want the actual metadata, not the coroutine objects.

I am new to asynchronous programming in Python, and any guidance would be greatly appreciated!

2 answers:

Answer 0 (score: 3)

"When I print the result of twitterData, I get a list of coroutine objects."

Calling a coroutine function only creates a coroutine object, just like calling a generator function only creates a generator object. To get actual data out of a coroutine object, you need to either await it from another coroutine, or run it in an event loop. In this case lookupTweets should itself be an async def coroutine, and it should await the statuses:
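That difference can be seen in a tiny standalone sketch (fetch_value is a hypothetical stand-in for an API call, not part of Tweepy):

```python
import asyncio

async def fetch_value():
    # Hypothetical stand-in for an API call.
    return 42

# Calling the coroutine function only creates a coroutine object...
coro = fetch_value()
print(type(coro).__name__)  # coroutine

# ...while running it in an event loop yields the actual return value.
loop = asyncio.new_event_loop()
result = loop.run_until_complete(coro)
loop.close()
print(result)  # 42
```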

statuses = await self.status_lookup(tweetIds[i:i + 100])

The same applies to status_lookup:

async def status_lookup(self, tweets):
    return await self.api.statuses_lookup(tweets)

The return value of the outermost coroutine will be returned by run_until_complete:

loop = asyncio.get_event_loop()
twitterData = loop.run_until_complete(twitterEngine.lookupTweets(tweets))    
print(twitterData)
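Putting both fixes together, here is a runnable sketch of the corrected flow. FakeApi is a made-up stand-in for the real tweepy.API object, and the ID list is invented, so only the control flow is meant to match the question's code:

```python
import asyncio

class FakeApi:
    """Hypothetical stand-in for tweepy.API."""
    def statuses_lookup(self, ids):
        # The real call would return tweet metadata for up to 100 ids.
        return [{"id": i} for i in ids]

class TwitterEngine:
    def __init__(self):
        self.api = FakeApi()

    async def status_lookup(self, tweets):
        return self.api.statuses_lookup(tweets)

    async def lookupTweets(self, tweetIds):
        tweetData = []
        i = 0
        while i < len(tweetIds):
            # Awaiting here yields real data instead of coroutine objects.
            statuses = await self.status_lookup(tweetIds[i:i + 100])
            tweetData.append(statuses)
            i += 100
        return tweetData

engine = TwitterEngine()
loop = asyncio.new_event_loop()
twitterData = loop.run_until_complete(engine.lookupTweets(list(range(250))))
loop.close()
print(len(twitterData))  # 3 batches of results (100 + 100 + 50 ids)
```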

Answer 1 (score: 1)

Coroutine objects (the result of calling an async def function) need to be associated with futures in order to access the returned values.

There are several ways to do this, but if you have a list of coroutine objects you can use asyncio.gather.