How can I collect function return values from multiple threads without using a global variable?

Asked: 2016-11-18 22:18:59

Tags: python multithreading

So, I am trying to come up with a general solution that collects all the values returned from a function and appends them to a list that is accessible later. This would be used during concurrent.futures or threading type tasks. Here is my solution using a global master_list:
from concurrent.futures import ThreadPoolExecutor

master_list = []
def return_from_multithreaded(func):
    # master_list = []
    def wrapper(*args, **kwargs):
        # nonlocal master_list
        global master_list
        master_list += func(*args, **kwargs)
    return wrapper


@return_from_multithreaded
def f(n):
    return [n]


with ThreadPoolExecutor(max_workers=20) as exec:
    exec.map(f, range(1, 100))

print(master_list)

I would like to find a solution that does not involve a global variable, perhaps by returning the commented-out master_list stored as a closure?
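For reference, the closure idea hinted at by the commented-out lines could be sketched like this. This is a minimal sketch of my own, not code from the question; the `collect_results` name and the returned `(wrapper, results)` pair are illustrative choices:

```python
from concurrent.futures import ThreadPoolExecutor

def collect_results(func):
    # Hypothetical decorator-like factory: `results` lives in the
    # closure instead of the module's global namespace.
    results = []
    def wrapper(*args, **kwargs):
        # list.extend on a shared list is safe enough here because
        # CPython's GIL makes the single extend call atomic.
        results.extend(func(*args, **kwargs))
    return wrapper, results

wrapped_f, master_list = collect_results(lambda n: [n])

with ThreadPoolExecutor(max_workers=20) as executor:
    executor.map(wrapped_f, range(1, 100))

# Insertion order depends on thread scheduling, but all values arrive.
print(sorted(master_list))
```

The trade-off is that the caller must keep a reference to the closed-over list, since the decorator can no longer expose it through a module-level name.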

2 Answers:

Answer 0 (score: 2)

I have faced this problem in the past: Running multiple asynchronous function and get the returned value of each function. Here was my approach:

from multiprocessing import Process, Queue

def async_call(func_list):
    """
    Runs the list of functions asynchronously.

    :param func_list: Expects list of lists to be of format
        [[func1, args1, kwargs1], [func2, args2, kwargs2], ...]
    :return: List of output of the functions
        [output1, output2, ...]
    """
    def worker(function, f_args, f_kwargs, queue, index):
        """
        Runs the function and puts its output on the queue, or the exception in the case of error
        """
        response = {
            'index': index,  # For tracking the index of each function in actual list.
                             # Since, this function is called asynchronously, order in
                             # queue may differ
            'data': None,
            'error': None
        }

        # Handle error in the function call
        try:
            response['data'] = function(*f_args, **f_kwargs)
        except Exception as e:
            response['error'] = e  # send back the exception along with the queue

        queue.put(response)
    queue = Queue()
    processes = [Process(target=worker, args=(func, args, kwargs, queue, i))
                 for i, (func, args, kwargs) in enumerate(func_list)]

    for process in processes:
        process.start()

    response_list = []
    for process in processes:
        # Wait for process to finish
        process.join()

        # Get back the response from the queue
        response = queue.get()
        if response['error']:
            raise response['error']   # Raise exception if the function call failed
        response_list.append(response)

    return [content['data'] for content in sorted(response_list, key=lambda x: x['index'])]

Sample run:

def my_sum(x, y):
    return x + y

def your_mul(x, y):
    return x*y

my_func_list = [[my_sum, [1], {'y': 2}], [your_mul, [], {'x':1, 'y':2}]]

async_call(my_func_list)
# Value returned: [3, 2]
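Since the question is about threads rather than processes, the same pattern can be adapted with `threading.Thread` and the thread-safe `queue.Queue`. This is my own sketch, not part of the answer above; `async_call_threaded` is a hypothetical name:

```python
import threading
from queue import Queue

def worker(function, f_args, f_kwargs, out_queue, index):
    # Same response shape as the process-based version, run in a thread.
    response = {'index': index, 'data': None, 'error': None}
    try:
        response['data'] = function(*f_args, **f_kwargs)
    except Exception as e:
        response['error'] = e
    out_queue.put(response)

def async_call_threaded(func_list):
    out_queue = Queue()
    threads = [threading.Thread(target=worker, args=(func, args, kwargs, out_queue, i))
               for i, (func, args, kwargs) in enumerate(func_list)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    responses = [out_queue.get() for _ in threads]
    for r in responses:
        if r['error']:
            raise r['error']
    # Restore submission order before extracting the data.
    return [r['data'] for r in sorted(responses, key=lambda x: x['index'])]

print(async_call_threaded([[pow, [2, 10], {}], [len, ['abc'], {}]]))  # [1024, 3]
```

Threads avoid the pickling restrictions of multiprocessing, which matters if the functions or arguments are not picklable.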

Answer 1 (score: 2)

If you don't want to use a global, then don't discard the results of map. map returns the values each call to the function returned; you are simply ignoring them. This code can be made simpler by using map for its intended purpose:

def f(n):
    return n  # No need to wrap in list

with ThreadPoolExecutor(max_workers=20) as exec:
    master_list = list(exec.map(f, range(1, 100)))

print(master_list)

If you need a master_list that shows the results computed so far (maybe some other thread is watching it), you just make the loop explicit:

def f(n):
    return n  # No need to wrap in list

master_list = []
with ThreadPoolExecutor(max_workers=20) as exec:
    for result in exec.map(f, range(1, 100)):
        master_list.append(result)

print(master_list)

This is a design goal of the Executor model; plain threads aren't intended to return values, but Executors provide a channel for returning them, so you don't have to manage it yourself. Internally, this uses some form of queue, with additional metadata to keep the results in order, but you don't need to deal with that complexity; from your perspective, it's equivalent to the regular map function, it just happens to parallelize the work.
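The ordering guarantee mentioned above can be demonstrated with a small sketch of mine (not from the answer): later submissions deliberately finish first, yet map still yields results in submission order.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_identity(n):
    # Higher n sleeps less, so later tasks complete first.
    time.sleep((5 - n) * 0.01)
    return n

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(slow_identity, range(5)))

print(results)  # [0, 1, 2, 3, 4] despite reversed completion order
```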

Updated to cover handling exceptions:

map will raise any exceptions raised in the workers when the result is hit. So as written, the first set of code won't store anything if any of the tasks fail (the list will be partially constructed, but thrown away when the exception is raised). The second example will only keep the results before the first exception is thrown, the rest being discarded (you'd have to store the map iterator and use some awkward code to avoid that). If you need to store all the successful results, ignoring the failures (or just logging them in some way), it's easiest to use submit to create a list of Future objects, then wait on them, either serially or in order of completion, wrapping the .result() calls in try/except to avoid throwing away the good results. For example, to store results in order of submission, you'd do:

master_list = []
with ThreadPoolExecutor(max_workers=20) as exec:
    futures = [exec.submit(f, i) for i in range(1, 100)]
    exec.shutdown(False)  # Optional: workers terminate as soon as all futures finish,
                          # rather than waiting for all results to be processed
    for fut in futures:
        try:
            master_list.append(fut.result())
        except Exception:
            pass  # log the error here

For slightly more efficient code, you can retrieve the results in order of completion rather than submission, using concurrent.futures.as_completed to eagerly retrieve results as they finish. The only change from the previous code is that:

    for fut in futures:

becomes:

    for fut in concurrent.futures.as_completed(futures):

where as_completed does the work of yielding completed/cancelled futures as soon as they finish, instead of delaying until all the futures submitted earlier complete and are processed.
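Put together, the as_completed variant of the earlier snippet would look like the following. This is a sketch combining the pieces above, with a trivial f used for illustration:

```python
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor

def f(n):
    return n

master_list = []
with ThreadPoolExecutor(max_workers=20) as executor:
    futures = [executor.submit(f, i) for i in range(1, 100)]
    # Yields each future as soon as it finishes, in completion order.
    for fut in concurrent.futures.as_completed(futures):
        try:
            master_list.append(fut.result())
        except Exception:
            pass  # log the error here

# Completion order varies run to run; sorted gives 1 through 99.
print(sorted(master_list))
```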

There are more complicated options involving the use of add_done_callback so the main thread isn't involved in explicitly handling the results at all, but that's usually unnecessary, and often confusing, so it's best avoided if possible.
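For completeness, the add_done_callback approach mentioned above might look like this. A sketch of mine, not the answer's code; note that the callback runs in the worker thread that completed the future, so collecting into a shared list relies on list.append being atomic under CPython's GIL:

```python
from concurrent.futures import ThreadPoolExecutor

def f(n):
    return n

master_list = []

def on_done(fut):
    # Invoked in the worker thread right after the future completes;
    # exceptions raised by f surface through fut.result().
    try:
        master_list.append(fut.result())
    except Exception:
        pass  # log the error here

with ThreadPoolExecutor(max_workers=20) as executor:
    for i in range(1, 100):
        executor.submit(f, i).add_done_callback(on_done)

# Exiting the with-block joins all workers, so every callback has run.
print(sorted(master_list))
```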