Question

I have a Python program that looks like this:

functionForInsertingRows($aInc){
  update tableName set 
  measurement_name = $aInc["measurement_name"],
  measurement_last_updated = $aInc["measurement_last_updated"]
  where measurement_id = $aInc["measurement_id"]
}

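The function 'some_function_call' takes a long time to run, and I can't find an easy way to reduce its time complexity. Is there a way to cut the execution time by running the calls as parallel tasks and then adding their results into total_error? I tried using pool and joblib, but couldn't get either of them to work.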
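
For reference, the serial loop that the answers below parallelize can be sketched roughly as follows. This is only a reconstruction from the names used in the answers (`some_function_call`, `parameters1`, `parameters2`, 24 runs, `total_error`); the body of `some_function_call` and the helper `serial_total_error` are hypothetical stand-ins.

```python
def some_function_call(parameters1, parameters2):
    # Hypothetical stand-in for the slow function; returns one error value.
    return abs(parameters1 - parameters2)


def serial_total_error(parameters1, parameters2, runs=24):
    # Call the function `runs` times, one after another, and add up the results.
    total_error = 0
    for _ in range(runs):
        total_error += some_function_call(parameters1, parameters2)
    return total_error


if __name__ == "__main__":
    print(serial_total_error(3.0, 1.5))
```

Each iteration is independent of the others, which is what makes the loop a candidate for running in parallel.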

Answer 1

You can use Python's multiprocessing module:

from multiprocessing import Pool, cpu_count, freeze_support

def wrapped_some_function_call(args):
    """
    pool.map passes exactly one argument per call, so we wrap the
    function to unpack the (parameters1, parameters2) tuple built below.
    """
    return some_function_call(*args)

if __name__ == "__main__":
    # Only matters for frozen executables (e.g. on Windows); harmless otherwise.
    freeze_support()
    all_args = [(parameters1, parameters2) for _ in range(24)]
    # One worker per core is usually a good choice, although not always the best one.
    with Pool(cpu_count()) as pool:
        results = pool.map(wrapped_some_function_call, all_args)
    total_error = sum(results)
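
As a variation on the same idea, `Pool.starmap` unpacks each argument tuple itself, so the wrapper function is not needed. The sketch below is self-contained only because it makes assumptions: `some_function_call` is a hypothetical stand-in, and `build_args` and `parallel_total_error` are illustrative helper names, not part of the original program.

```python
from multiprocessing import Pool


def some_function_call(parameters1, parameters2):
    # Hypothetical stand-in for the slow function; returns one error value.
    return (parameters1 - parameters2) ** 2


def build_args(parameters1, parameters2, runs=24):
    # One (parameters1, parameters2) tuple per run.
    return [(parameters1, parameters2) for _ in range(runs)]


def parallel_total_error(all_args, processes=None):
    # processes=None lets Pool pick a worker count based on the machine's CPUs.
    with Pool(processes) as pool:
        # starmap unpacks each tuple into positional arguments for the call.
        return sum(pool.starmap(some_function_call, all_args))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    print(parallel_total_error(build_args(5, 3)))
```

The worker function has to be defined at module level so that it can be pickled and sent to the worker processes.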

Answer 2

You can also use concurrent.futures in Python 3, which offers a simpler interface than multiprocessing. See this for more details about the differences.

from concurrent import futures

total_error = 0

with futures.ProcessPoolExecutor() as pool:
  for error in pool.map(some_function_call, parameters1, parameters2):
    total_error += error

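In this case, parameters1 and parameters2 should each be a list or other iterable with as many elements as the number of times you want to run the function (24, going by your example).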
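
To make the pairing concrete: `pool.map` with two iterables calls the function on the first elements of both, then the second elements, and so on, much like `zip`, and it stops at the shorter iterable. Here is a minimal sketch, assuming a hypothetical `some_function_call` and made-up input lists.

```python
from concurrent import futures


def some_function_call(parameters1, parameters2):
    # Hypothetical stand-in for the slow function; returns one error value.
    return abs(parameters1 - parameters2)


def total_error_over(parameters1, parameters2):
    # The i-th call receives (parameters1[i], parameters2[i]).
    with futures.ProcessPoolExecutor() as pool:
        return sum(pool.map(some_function_call, parameters1, parameters2))


if __name__ == "__main__":
    # Made-up inputs: 24 elements each, so the function runs 24 times.
    predictions = [float(i) for i in range(24)]
    targets = [i + 0.5 for i in range(24)]
    print(total_error_over(predictions, targets))
```

If the same two values should be used for every run, lists such as `[parameters1] * 24` and `[parameters2] * 24` would satisfy this requirement.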

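If parameters<1,2> are not iterable/mappable and you simply want to run the function 24 times, you can submit the function as a job the required number of times and then collect the results with a callback.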

class TotalError:
    def __init__(self):
        self.value = 0

    def __call__(self, r):
        self.value += r.result()

total_error = TotalError()
with futures.ProcessPoolExecutor() as pool:
  for i in range(24):
    future_result = pool.submit(some_function_call, parameters1, parameters2)
    future_result.add_done_callback(total_error)

print(total_error.value)
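
If you would rather not use a callback object, one alternative is to keep the futures in a list and add up their results with `futures.as_completed`. This is a sketch under assumptions rather than a drop-in replacement: `some_function_call` is a hypothetical stand-in, and `total_error_from_futures` and `run` are illustrative names.

```python
from concurrent import futures


def some_function_call(parameters1, parameters2):
    # Hypothetical stand-in for the slow function; returns one error value.
    return parameters1 * parameters2


def total_error_from_futures(pending):
    # Add up results in whatever order the jobs finish.
    total = 0
    for done in futures.as_completed(pending):
        # result() re-raises any exception raised inside the worker.
        total += done.result()
    return total


def run(parameters1, parameters2, runs=24):
    with futures.ProcessPoolExecutor() as pool:
        pending = [
            pool.submit(some_function_call, parameters1, parameters2)
            for _ in range(runs)
        ]
        return total_error_from_futures(pending)


if __name__ == "__main__":
    print(run(2, 3))
```

Because `result()` is called in the main flow of the program, an error in any job surfaces where you collect the results instead of inside a callback.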

Implementing parallel for loops in Python
