So I am trying to work out a generic solution that will collect all values from a function and append them to a list that is accessible later. This is to be used during concurrent.futures or threading type tasks. Here is my attempt using a global master_list:
from concurrent.futures import ThreadPoolExecutor

master_list = []

def return_from_multithreaded(func):
    # master_list = []
    def wrapper(*args, **kwargs):
        # nonlocal master_list
        global master_list
        master_list += func(*args, **kwargs)
    return wrapper

@return_from_multithreaded
def f(n):
    return [n]

with ThreadPoolExecutor(max_workers=20) as exec:
    exec.map(f, range(1, 100))

print(master_list)
I would like to find a solution that does not involve a global variable, perhaps returning the commented-out master_list stored as a closure?
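For reference, the closure-based variant hinted at by the commented-out lines could look like the sketch below. This is an illustrative assumption, not code from the question: each decorated function gets its own `results` list closed over by `wrapper`, exposed as an attribute on the wrapper so no global is needed (list mutation here is safe under CPython's GIL for this usage):

```python
from concurrent.futures import ThreadPoolExecutor

def return_from_multithreaded(func):
    results = []  # closed over by wrapper; one list per decorated function

    def wrapper(*args, **kwargs):
        results.extend(func(*args, **kwargs))

    wrapper.results = results  # expose the closed-over list to callers
    return wrapper

@return_from_multithreaded
def f(n):
    return [n]

with ThreadPoolExecutor(max_workers=20) as executor:
    executor.map(f, range(1, 100))

print(sorted(f.results))  # sorted, since completion order may vary
```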
Answer 0 (score: 2)
I have faced this problem in the past: Running multiple asynchronous function and get the returned value of each function. Here is my approach:
from multiprocessing import Process, Queue

def async_call(func_list):
    """
    Runs the list of functions asynchronously.
    :param func_list: Expects list of lists to be of format
        [[func1, args1, kwargs1], [func2, args2, kwargs2], ...]
    :return: List of output of the functions
        [output1, output2, ...]
    """
    def worker(function, f_args, f_kwargs, queue, index):
        """
        Runs the function and puts the output on the queue, along with
        the Exception in the case of error.
        """
        response = {
            'index': index,  # For tracking the index of each function in the
                             # actual list. Since this function is called
                             # asynchronously, order in the queue may differ.
            'data': None,
            'error': None
        }
        # Handle error in the function call
        try:
            response['data'] = function(*f_args, **f_kwargs)
        except Exception as e:
            response['error'] = e  # send back the exception via the queue
        queue.put(response)

    queue = Queue()
    processes = [Process(target=worker, args=(func, args, kwargs, queue, i))
                 for i, (func, args, kwargs) in enumerate(func_list)]
    for process in processes:
        process.start()
    response_list = []
    for process in processes:
        # Wait for the process to finish
        process.join()
        # Get back the response from the queue
        response = queue.get()
        if response['error']:
            raise response['error']  # Raise exception if the function call failed
        response_list.append(response)
    return [content['data'] for content in
            sorted(response_list, key=lambda x: x['index'])]
Sample run:
def my_sum(x, y):
    return x + y

def your_mul(x, y):
    return x * y

my_func_list = [[my_sum, [1], {'y': 2}], [your_mul, [], {'x': 1, 'y': 2}]]

async_call(my_func_list)
# Value returned: [3, 2]
Answer 1 (score: 2)
If you don't want to use a global, don't discard the results of map. map returns the values each function call returns; you're just ignoring them. This code could be made much simpler by using map for its intended purpose:
def f(n):
    return n  # No need to wrap in list

with ThreadPoolExecutor(max_workers=20) as exec:
    master_list = list(exec.map(f, range(1, 100)))
print(master_list)
If you need a master_list that shows the results computed so far (maybe some other thread is watching it), you just make the loop explicit:
def f(n):
    return n  # No need to wrap in list

master_list = []
with ThreadPoolExecutor(max_workers=20) as exec:
    for result in exec.map(f, range(1, 100)):
        master_list.append(result)
print(master_list)
This is a design goal of the Executor model; plain threads aren't intended to return values, but Executors provide a channel for returning them, so you don't have to manage it yourself. Internally, this uses some form of queue with extra metadata to keep the results in order, but you don't need to deal with that complexity; from your perspective, it's equivalent to the regular map function, it just happens to parallelize the work.
Update to cover handling exceptions:
map will raise, in the main thread, any exception raised in a worker when its result is reached. Thus, as written, the first block of code won't store anything if any of the tasks fail (the list will be partially constructed, but thrown away when the exception raises). The second example only keeps the results produced before the first exception is thrown; the rest are discarded (you'd have to store the map iterator and use some awkward code to avoid that). If you need to store all successful results while ignoring failures (or just logging them somehow), the easiest approach is to use submit to create a list of Future objects, then wait on them, either serially or in order of completion, wrapping the .result() calls in try/except to avoid throwing away the good results. For example, to store results in order of submission, you'd do:
master_list = []
with ThreadPoolExecutor(max_workers=20) as exec:
    futures = [exec.submit(f, i) for i in range(1, 100)]
    exec.shutdown(False)  # Optional: workers terminate as soon as all
                          # futures finish, rather than waiting for all
                          # results to be processed
    for fut in futures:
        try:
            master_list.append(fut.result())
        except Exception:
            ... log error here ...
For more efficient code, you can retrieve results in order of completion rather than submission, using concurrent.futures.as_completed to eagerly retrieve results as they finish. The only change from the previous code is that:
for fut in futures:
becomes:
for fut in concurrent.futures.as_completed(futures):
where as_completed does the work of yielding the futures as they complete or are cancelled, rather than deferring until all futures submitted earlier have finished and been processed.
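Putting the pieces above together, a complete, runnable sketch of the as_completed variant might look like this. The failing task (`ValueError` for multiples of 10) is an invented example to show that good results survive failures:

```python
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor

def f(n):
    if n % 10 == 0:
        raise ValueError(n)  # simulate occasional task failures
    return n

master_list = []
errors = []
with ThreadPoolExecutor(max_workers=20) as executor:
    futures = [executor.submit(f, i) for i in range(1, 100)]
    # Yields each future as soon as it finishes, not in submission order
    for fut in concurrent.futures.as_completed(futures):
        try:
            master_list.append(fut.result())
        except Exception as e:
            errors.append(e)  # failed tasks are recorded, not fatal

print(sorted(master_list))
```

Note that master_list ends up in completion order, so sort it (or track indices) if submission order matters.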
There are more complex options involving the use of add_done_callback so the main thread isn't involved in explicitly handling the results at all, but that's usually unnecessary, and often confusing, so it's best avoided if possible.
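For completeness, a minimal sketch of that add_done_callback approach could look like the following. The callback runs in whichever worker thread completes the future, so the shared list is guarded with a lock; the names here are illustrative assumptions:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

master_list = []
lock = threading.Lock()  # callbacks may fire concurrently in worker threads

def f(n):
    return n

def collect(fut):
    # Invoked by the thread that completed the future (or immediately by
    # the submitting thread if the future is already done).
    try:
        result = fut.result()
    except Exception:
        return  # drop failed tasks; a real version might log them
    with lock:
        master_list.append(result)

with ThreadPoolExecutor(max_workers=20) as executor:
    for i in range(1, 100):
        executor.submit(f, i).add_done_callback(collect)

print(sorted(master_list))  # completion order varies, so sort for display
```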