Question

I have the following code that makes requests to Amazon's API:

params = {'Operation': 'GetRequesterStatistic', 'Statistic': 'NumberHITsAssignable', 'TimePeriod': 'LifeToDate'}
response = self.conn.make_request(action=None, params=params, path='/', verb='GET')
data['ActiveHITs'] = self.conn._process_response(response).LongValue

params = {'Operation': 'GetRequesterStatistic', 'Statistic': 'NumberAssignmentsPending', 'TimePeriod': 'LifeToDate'}
response = self.conn.make_request(action=None, params=params, path='/', verb='GET')
data['PendingAssignments'] = self.conn._process_response(response).LongValue

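Each of these requests takes roughly 1 second while waiting for Amazon to return data. How can I run the two in parallel so that, ideally, they take 1 second in total instead of 2?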

Answer 1

You can use multiprocessing.Pool to parallelize the requests:

from multiprocessing import Pool

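class Foo:
    # Single leading underscore: a double-underscore name is mangled to
    # _Foo__fetch, which can break pickling of the bound method.
    def _fetch(self, statistic):
        params = {
            'Operation': 'GetRequesterStatistic',
            'Statistic': statistic,
            'TimePeriod': 'LifeToDate'
        }
        response = self.conn.make_request(
            action=None, params=params, path='/', verb='GET'
        )
        return self.conn._process_response(response).LongValue

    def get_stats(self):
        # Pool pickles the bound method to send it to the worker processes,
        # so self (including self.conn) must be picklable.
        pool = Pool()
        try:
            # map() returns results in input order, so they unpack positionally.
            results = pool.map(self._fetch, [
                'NumberHITsAssignable', 'NumberAssignmentsPending'
            ])
        finally:
            pool.close()
            pool.join()
        # `data` is the same dict used in the question.
        data['ActiveHITs'], data['PendingAssignments'] = results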

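This has the nice effect of being able to parallelize any number of requests. By default, Pool creates one worker process per CPU core; you can change that by passing a number to Pool, for example Pool(2).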
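
Since each request spends most of its time waiting on the network, a thread pool may be a lighter alternative to worker processes, and it avoids having to pickle the connection object. The following is only a minimal sketch of that idea using the standard library's concurrent.futures. fetch_statistic is a hypothetical stand-in that sleeps instead of calling Amazon, and get_stats is an illustrative helper; neither comes from the original code. Whether the real connection object can safely be shared between threads is an assumption you would need to check.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_statistic(statistic):
    """Hypothetical stand-in for make_request + _process_response."""
    time.sleep(0.5)          # simulate the network round trip
    return len(statistic)    # dummy value in place of LongValue


def get_stats(fetch=fetch_statistic):
    # Map the keys used in the question to the statistic names to request.
    names = {
        'ActiveHITs': 'NumberHITsAssignable',
        'PendingAssignments': 'NumberAssignmentsPending',
    }
    # One thread per request; the threads wait on I/O concurrently.
    with ThreadPoolExecutor(max_workers=len(names)) as executor:
        # executor.map yields results in the same order as its inputs.
        values = executor.map(fetch, names.values())
        return dict(zip(names.keys(), values))


if __name__ == '__main__':
    start = time.perf_counter()
    data = get_stats()
    print(data, 'in %.2fs' % (time.perf_counter() - start))
```

With the simulated half-second delay, both lookups should finish in roughly the time of one. To try it against the real API, you would pass your own fetch function (for example, the _fetch method above) as the fetch argument.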
