I have the following code that makes requests to Amazon's API:
params = {'Operation': 'GetRequesterStatistic', 'Statistic': 'NumberHITsAssignable', 'TimePeriod': 'LifeToDate'}
response = self.conn.make_request(action=None, params=params, path='/', verb='GET')
data['ActiveHITs'] = self.conn._process_response(response).LongValue
params = {'Operation': 'GetRequesterStatistic', 'Statistic': 'NumberAssignmentsPending', 'TimePeriod': 'LifeToDate'}
response = self.conn.make_request(action=None, params=params, path='/', verb='GET')
data['PendingAssignments'] = self.conn._process_response(response).LongValue
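Each of these requests takes about 1 second while waiting for Amazon to return data. How can I run the two in parallel so that (ideally) it takes 1 second to run instead of 2?
Answer 0 (score: 1)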
You can use multiprocessing.Pool to parallelize the requests:
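from multiprocessing import Pool

class Foo:
    # Assumes self.conn is set up elsewhere, as in the question.
    # Single underscore: a name-mangled __fetch can fail to pickle
    # when the bound method is sent to the worker processes.
    def _fetch(self, statistic):
        params = {
            'Operation': 'GetRequesterStatistic',
            'Statistic': statistic,
            'TimePeriod': 'LifeToDate'
        }
        response = self.conn.make_request(
            action=None, params=params, path='/', verb='GET'
        )
        return self.conn._process_response(response).LongValue

    def get_stats(self):
        data = {}
        pool = Pool()
        # map() returns results in the same order as the inputs.
        results = pool.map(self._fetch, [
            'NumberHITsAssignable', 'NumberAssignmentsPending'
        ])
        pool.close()
        pool.join()
        data['ActiveHITs'], data['PendingAssignments'] = results
        return data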
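This has the nice effect of parallelizing any given number of requests. By default, one worker process is created per core; you can change that number by passing the processes argument to Pool.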
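Since these calls spend most of their time waiting on the network, a thread pool may be enough and avoids having to pickle the connection object. Below is a minimal sketch of that variant using concurrent.futures.ThreadPoolExecutor. Note that fake_fetch, FAKE_VALUES, and the numbers in it are hypothetical stand-ins for the real Amazon request, added only so the example runs on its own; in real code you would pass your own fetch method instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in data; these values are made up for illustration.
FAKE_VALUES = {'NumberHITsAssignable': 12, 'NumberAssignmentsPending': 3}


def fake_fetch(statistic):
    """Hypothetical replacement for the Amazon request in the question."""
    time.sleep(0.2)  # simulate network latency
    return FAKE_VALUES[statistic]


def get_stats(fetch=fake_fetch, max_workers=2):
    """Fetch both statistics concurrently and return them as a dict."""
    statistics = ['NumberHITsAssignable', 'NumberAssignmentsPending']
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map() yields results in the same order as the inputs.
        results = list(executor.map(fetch, statistics))
    return dict(zip(['ActiveHITs', 'PendingAssignments'], results))


if __name__ == '__main__':
    start = time.time()
    print(get_stats())
    print('elapsed: %.2fs' % (time.time() - start))
```

With two workers the two simulated calls overlap, so the elapsed time should be close to the latency of a single call rather than the sum of both.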