Question

I need to speed up a somewhat slow Python computation. Run as ordinary code, it uses only one of my CPUs. I need this process to use all of my CPUs.

I found that ProcessPoolExecutor() from the concurrent.futures module can help with this.

Here is an example that illustrates it (the original example is at https://towardsdatascience.com/heres-how-you-can-get-a-2-6x-speed-up-on-your-data-pre-processing-with-python-847887e63be5):

A Python function is written to resize every image in a folder to 600x600. The basic version is:

import glob
import os
import cv2

for image_filename in glob.glob("*.jpg"):
    img = cv2.imread(image_filename)
    img = cv2.resize(img, (600, 600))

To use ProcessPoolExecutor() and make it 6 times faster on a machine with 6 CPU cores, the code is:

import glob
import os
import cv2
import concurrent.futures

def load_and_resize(image_filename):
    img = cv2.imread(image_filename)
    img = cv2.resize(img, (600, 600))

with concurrent.futures.ProcessPoolExecutor() as executor:
    image_files = glob.glob("*.jpg")
    executor.map(load_and_resize, image_files)

OK, that is simple enough for me.

Now, how do I apply this to my own case?

My setup looks like this:

# basic function for performing the calculations
def slow_time_consuming_function(arg1, arg2, arg3, arg4):
    # do some slow calculations with arg1, arg2, arg3, arg4
    # return some float result
    ...  # placeholder body so the definition is valid Python

# list of arg1 values (usual length of 100-500) 
arg1_list = ['xx', 'xx1', 'xx2', ...]

# function for calculating the whole range of arg1 values
def multiple_arg1_values_calculation_function(arg1_list, arg2, arg3, arg4):
    # empty list for the results
    list_results = []

    # loop for calculating  the results
    for arg1 in arg1_list:
        list_results.append(slow_time_consuming_function(arg1, arg2, arg3, arg4))

    return list_results

My problem is that I cannot hard-code all the non-arg1 arguments into the basic function (some of them are custom Python objects, and so on).

So how do I convert my code to start using concurrent.futures.ProcessPoolExecutor()?

How do I use ProcessPoolExecutor from concurrent.futures with a function that takes multiple arguments?
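
For reference, one direction that might fit this setup is to bind the constant arguments with functools.partial, so that executor.map() only has to supply arg1. The following is a minimal sketch, not a definitive solution. The body of slow_time_consuming_function is a hypothetical stand-in, and the sketch assumes that arg2, arg3 and arg4 (including any custom objects) can be pickled, since they are sent to the worker processes.

```python
import concurrent.futures
from functools import partial


def slow_time_consuming_function(arg1, arg2, arg3, arg4):
    # Hypothetical stand-in for the real slow calculation; returns a float.
    return float(len(arg1) * arg2 + arg3 - arg4)


def multiple_arg1_values_calculation_function(arg1_list, arg2, arg3, arg4):
    # Bind the arguments that stay constant, so only arg1 varies per call.
    worker = partial(slow_time_consuming_function,
                     arg2=arg2, arg3=arg3, arg4=arg4)
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # map() yields results in the same order as arg1_list.
        return list(executor.map(worker, arg1_list))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # on platforms that start workers by importing the main module.
    arg1_list = ["xx", "xx1", "xx2"]
    print(multiple_arg1_values_calculation_function(arg1_list, 2, 10, 1))
```

The worker function has to be defined at module level so it can be pickled. An alternative with the same effect would be to pass itertools.repeat(arg2) and similar iterables as extra arguments to executor.map().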
