I am trying to use Tweepy's statuses_lookup API call to store a list of objects. Each call to statuses_lookup takes a list of IDs, with up to 100 IDs per call. The function below takes a list of IDs, and I am trying to append all of the metadata returned by the API calls to the tweetData list:
def lookupTweets(self, tweetIds):
    tweetData = []
    i = 0
    while i < len(tweetIds):
        print(i)
        if len(tweetIds) - i > 0:
            statuses = self.status_lookup(tweetIds[i + 99])
        else:
            statuses = self.status_lookup(tweetIds[i, len(tweetIds) - i])
        tweetData.append(statuses)
        i += 100
    return tweetData
This is the async function that makes the API call:
async def status_lookup(self, tweets):
    return self.api.statuses_lookup(tweets)
Here is the main method:
if __name__ == "__main__":
    twitterEngine = TwitterEngine()
    tweets = twitterEngine.ingestData("democratic-candidate-timelines.txt")
    twitterData = twitterEngine.lookupTweets(tweets)
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(twitterData))
    print(twitterData)
When I print the results of twitterData, I get a list of coroutine objects. However, I want the actual metadata, not the coroutine objects.
I am not familiar with asynchronous programming in Python, so any guidance is greatly appreciated!
Answer 0 (score: 3)
"When I print the results of twitterData, I get a list of coroutine objects."
Calling a coroutine function merely creates a coroutine object, just as calling a generator function merely creates a generator object. To get actual data out of a coroutine object, you need to either await it from another coroutine or run it in an event loop. Since status_lookup is an async def coroutine, lookupTweets should itself be an async def coroutine, and it should await the statuses:
statuses = await self.status_lookup(tweetIds[i + 99])
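Applied to the question's function, a minimal sketch of lookupTweets rewritten as a coroutine could look like the following. The tweetIds[i:i + 100] slice is an assumption about the intended batching of up to 100 IDs per call; the question's own indexing (tweetIds[i + 99]) differs.

async def lookupTweets(self, tweetIds):
    tweetData = []
    i = 0
    while i < len(tweetIds):
        # Assumed batching: slice off up to 100 IDs at a time; slicing
        # past the end of the list is safe in Python.
        statuses = await self.status_lookup(tweetIds[i:i + 100])
        tweetData.append(statuses)
        i += 100
    return tweetData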
The same applies to status_lookup:
async def status_lookup(self, tweets):
    return await self.api.statuses_lookup(tweets)
The return value of the outermost coroutine will be returned by run_until_complete:
loop = asyncio.get_event_loop()
twitterData = loop.run_until_complete(twitterEngine.lookupTweets(tweets))
print(twitterData)
Answer 1 (score: 1)
Coroutine objects (the result of calling an async def function) need to be associated with futures in order to access the returned values.
There are several ways to do this, but if you have a list of coroutine objects, you can use asyncio.gather:
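For instance, a minimal sketch of gathering the coroutine objects from the question's main method (assuming twitterData is the list of coroutine objects returned by the question's lookupTweets):

import asyncio

loop = asyncio.get_event_loop()
# asyncio.gather wraps the coroutine objects in futures, runs them, and
# returns their results in order, one entry per coroutine.
results = loop.run_until_complete(asyncio.gather(*twitterData))
print(results)

Unlike asyncio.wait, which returns sets of done and pending futures, asyncio.gather returns the results themselves, which is what the question is after.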