How can I collect function return values from multiple threads without using a global variable?

Asked: 2016-11-18 22:18:59

Tags: python multithreading

So, I am trying to come up with a general solution that collects all the values returned from a function and appends them to a list that is accessible later. This would be used during concurrent.futures or threading type tasks. Here is my solution using a global master_list:
from concurrent.futures import ThreadPoolExecutor

master_list = []
def return_from_multithreaded(func):
    # master_list = []
    def wrapper(*args, **kwargs):
        # nonlocal master_list
        global master_list
        master_list += func(*args, **kwargs)
    return wrapper


@return_from_multithreaded
def f(n):
    return [n]


with ThreadPoolExecutor(max_workers=20) as exec:
    exec.map(f, range(1, 100))

print(master_list)

I would like to find a solution that does not involve a global variable, perhaps by returning the commented-out master_list stored as a closure?
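For reference, the closure idea hinted at by the commented-out lines could be sketched like this. This is a minimal sketch of my own, not code from the question; the `collect_results` name and the returned `(wrapper, results)` pair are illustrative choices:

```python
from concurrent.futures import ThreadPoolExecutor

def collect_results(func):
    # Hypothetical decorator-like factory: `results` lives in the
    # closure instead of the module's global namespace.
    results = []
    def wrapper(*args, **kwargs):
        # list.extend on a shared list is safe enough here because
        # CPython's GIL makes the single extend call atomic.
        results.extend(func(*args, **kwargs))
    return wrapper, results

wrapped_f, master_list = collect_results(lambda n: [n])

with ThreadPoolExecutor(max_workers=20) as executor:
    executor.map(wrapped_f, range(1, 100))

# Insertion order depends on thread scheduling, but all values arrive.
print(sorted(master_list))
```

The trade-off is that the caller must keep a reference to the closed-over list, since the decorator can no longer expose it through a module-level name.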

2 Answers:

Answer 0 (score: 2)

I have faced this problem in the past: Running multiple asynchronous function and get the returned value of each function. Here was my approach:

from multiprocessing import Process, Queue

def async_call(func_list):
    """
    Runs the list of functions asynchronously.

    :param func_list: Expects list of lists to be of format
        [[func1, args1, kwargs1], [func2, args2, kwargs2], ...]
    :return: List of output of the functions
        [output1, output2, ...]
    """
    def worker(function, f_args, f_kwargs, queue, index):
        """
        Runs the function and puts its output on the queue, or the exception in the case of error
        """
        response = {
            'index': index,  # For tracking the index of each function in actual list.
                             # Since, this function is called asynchronously, order in
                             # queue may differ
            'data': None,
            'error': None
        }

        # Handle error in the function call
        try:
            response['data'] = function(*f_args, **f_kwargs)
        except Exception as e:
            response['error'] = e  # send back the exception along with the queue

        queue.put(response)
    queue = Queue()
    processes = [Process(target=worker, args=(func, args, kwargs, queue, i))
                 for i, (func, args, kwargs) in enumerate(func_list)]

    for process in processes:
        process.start()

    response_list = []
    for process in processes:
        # Wait for process to finish
        process.join()

        # Get back the response from the queue
        response = queue.get()
        if response['error']:
            raise response['error']   # Raise exception if the function call failed
        response_list.append(response)

    return [content['data'] for content in sorted(response_list, key=lambda x: x['index'])]

Sample run:

def my_sum(x, y):
    return x + y

def your_mul(x, y):
    return x*y

my_func_list = [[my_sum, [1], {'y': 2}], [your_mul, [], {'x':1, 'y':2}]]

async_call(my_func_list)
# Value returned: [3, 2]
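Since the question is about threads rather than processes, the same pattern can be adapted with `threading.Thread` and the thread-safe `queue.Queue`. This is my own sketch, not part of the answer above; `async_call_threaded` is a hypothetical name:

```python
import threading
from queue import Queue

def worker(function, f_args, f_kwargs, out_queue, index):
    # Same response shape as the process-based version, run in a thread.
    response = {'index': index, 'data': None, 'error': None}
    try:
        response['data'] = function(*f_args, **f_kwargs)
    except Exception as e:
        response['error'] = e
    out_queue.put(response)

def async_call_threaded(func_list):
    out_queue = Queue()
    threads = [threading.Thread(target=worker, args=(func, args, kwargs, out_queue, i))
               for i, (func, args, kwargs) in enumerate(func_list)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    responses = [out_queue.get() for _ in threads]
    for r in responses:
        if r['error']:
            raise r['error']
    # Restore submission order before extracting the data.
    return [r['data'] for r in sorted(responses, key=lambda x: x['index'])]

print(async_call_threaded([[pow, [2, 10], {}], [len, ['abc'], {}]]))  # [1024, 3]
```

Threads avoid the pickling restrictions of multiprocessing, which matters if the functions or arguments are not picklable.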

Answer 1 (score: 2)

If you don't want to use a global, then don't discard the results of map. map returns the values each call to the function returned; you are simply ignoring them. This code can be made simpler by using map for its intended purpose:

def f(n):
    return n  # No need to wrap in list

with ThreadPoolExecutor(max_workers=20) as exec:
    master_list = list(exec.map(f, range(1, 100)))

print(master_list)

If you need a master_list that shows the results computed so far (maybe some other thread is watching it), you just make the loop explicit:

def f(n):
    return n  # No need to wrap in list

master_list = []
with ThreadPoolExecutor(max_workers=20) as exec:
    for result in exec.map(f, range(1, 100)):
        master_list.append(result)

print(master_list)

This is a design goal of the Executor model; plain threads aren't intended to return values, but Executors provide a channel for returning them, so you don't have to manage it yourself. Internally, this uses some form of queue, with additional metadata to keep the results in order, but you don't need to deal with that complexity; from your perspective, it's equivalent to the regular map function, it just happens to parallelize the work.
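The ordering guarantee mentioned above can be demonstrated with a small sketch of mine (not from the answer): later submissions deliberately finish first, yet map still yields results in submission order.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_identity(n):
    # Higher n sleeps less, so later tasks complete first.
    time.sleep((5 - n) * 0.01)
    return n

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(slow_identity, range(5)))

print(results)  # [0, 1, 2, 3, 4] despite reversed completion order
```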

Updated to cover handling exceptions:

map will raise any exceptions raised in the workers when the result is hit. So as written, the first set of code won't store anything if any of the tasks fail (the list will be partially constructed, but thrown away when the exception is raised). The second example will only keep the results before the first exception is thrown, the rest being discarded (you'd have to store the map iterator and use some awkward code to avoid that). If you need to store all the successful results, ignoring the failures (or just logging them in some way), it's easiest to use submit to create a list of Future objects, then wait on them, either serially or in order of completion, wrapping the .result() calls in try/except to avoid throwing away the good results. For example, to store results in order of submission, you'd do:

master_list = []
with ThreadPoolExecutor(max_workers=20) as exec:
    futures = [exec.submit(f, i) for i in range(1, 100)]
    exec.shutdown(False)  # Optional: workers terminate as soon as all futures finish,
                          # rather than waiting for all results to be processed
    for fut in futures:
        try:
            master_list.append(fut.result())
        except Exception:
            pass  # log the error here

For slightly more efficient code, you can retrieve the results in order of completion rather than submission, using concurrent.futures.as_completed to eagerly retrieve results as they finish. The only change from the previous code is that:

    for fut in futures:

becomes:

    for fut in concurrent.futures.as_completed(futures):

where as_completed does the work of yielding completed/cancelled futures as soon as they finish, instead of delaying until all the futures submitted earlier complete and are processed.
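Put together, the as_completed variant of the earlier snippet would look like the following. This is a sketch combining the pieces above, with a trivial f used for illustration:

```python
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor

def f(n):
    return n

master_list = []
with ThreadPoolExecutor(max_workers=20) as executor:
    futures = [executor.submit(f, i) for i in range(1, 100)]
    # Yields each future as soon as it finishes, in completion order.
    for fut in concurrent.futures.as_completed(futures):
        try:
            master_list.append(fut.result())
        except Exception:
            pass  # log the error here

# Completion order varies run to run; sorted gives 1 through 99.
print(sorted(master_list))
```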

There are more complicated options involving the use of add_done_callback so the main thread isn't involved in explicitly handling the results at all, but that's usually unnecessary, and often confusing, so it's best avoided if possible.
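For completeness, the add_done_callback approach mentioned above might look like this. A sketch of mine, not the answer's code; note that the callback runs in the worker thread that completed the future, so collecting into a shared list relies on list.append being atomic under CPython's GIL:

```python
from concurrent.futures import ThreadPoolExecutor

def f(n):
    return n

master_list = []

def on_done(fut):
    # Invoked in the worker thread right after the future completes;
    # exceptions raised by f surface through fut.result().
    try:
        master_list.append(fut.result())
    except Exception:
        pass  # log the error here

with ThreadPoolExecutor(max_workers=20) as executor:
    for i in range(1, 100):
        executor.submit(f, i).add_done_callback(on_done)

# Exiting the with-block joins all workers, so every callback has run.
print(sorted(master_list))
```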