Question

I'm trying to start 6 threads, each of which takes an item from the list files, removes it, and then prints the value.

from multiprocessing import Pool

files = ['a','b','c','d','e','f']

def convert(file):
    process_file = files.pop()
    print(process_file)

if __name__ == '__main__':

    pool = Pool(processes=6)
    pool.map(convert,range(6))

The expected output would be:

a
b
c
d
e
f

Instead, the output is:

f
f
f
f
f
f

What is going on? Thanks in advance.

Answer 1

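Part of the problem is that you are not accounting for the multi-process nature of Pool (note that in Python, multithreading brings no performance gain for CPU-bound code because of the Global Interpreter Lock).

Do you actually need to modify the original list? Your current code ignores the iterable passed in and instead mutates a shared mutable object, which is dangerous in a concurrent world. A simple solution is: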

from multiprocessing import Pool

files = ['a','b','c','d','e','f']

def convert(aFile):
    print(aFile)

if __name__ == '__main__':

    pool = Pool()  # by default, Pool starts one worker per CPU core
    pool.map(convert,files)
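
If you need the values back in the parent rather than just printed, a minimal sketch along the same lines is to have the worker return its result; pool.map hands the results back in the order of the input iterable. The upper-casing here is only a hypothetical stand-in for whatever the real conversion does, and main() is a name made up for this example.

```python
from multiprocessing import Pool

files = ['a', 'b', 'c', 'd', 'e', 'f']

def convert(a_file):
    # Hypothetical stand-in for the real work: upper-case the name.
    return a_file.upper()

def main():
    pool = Pool(processes=6)
    try:
        # map() sends one item to each call and returns results in input order.
        results = pool.map(convert, files)
    finally:
        pool.close()
        pool.join()
    for value in results:
        print(value)
    return results

if __name__ == '__main__':
    main()
```

Because every call receives its own item as an argument, nothing shared is mutated and the order of the results does not depend on which worker finishes first.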

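Your question got me thinking, so I did some digging into why Python behaves this way. Each worker process gets its own copy of the module's objects. On Unix the workers are typically created with fork, which duplicates the parent's memory, so id() reports the same value in every process even though the lists are independent copies. You can see this by changing the number of processes used: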

from multiprocessing import Pool

files = ['d','e','f','a','b','c',]

a = sorted(files)
def convert(_):
    print(a == files)
    files.sort()
    # print(id(files))  # note this is the same for every process, which is interesting

if __name__ == '__main__':

    pool = Pool(processes=1)  # try changing this to 2
    pool.map(convert,range(6))

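==> As expected, every call prints "True" except the first.

If you set the number of processes to 2, the result is less deterministic, because it depends on which process actually executes its statements first.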
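
The per-process copies can also be observed from the parent's side. The following is a rough sketch, assuming Python 3 style print calls; pop_one and main are names invented for this example. Each worker pops from its own copy of files, so the parent's list is still intact after the map finishes.

```python
from multiprocessing import Pool

files = ['a', 'b', 'c', 'd', 'e', 'f']

def pop_one(_):
    # Runs in a worker process, so this pops from that worker's own copy.
    return files.pop()

def main():
    pool = Pool(processes=2)
    try:
        popped = pool.map(pop_one, range(4))
    finally:
        pool.close()
        pool.join()
    print(popped)      # varies with how the calls were spread over the workers
    print(len(files))  # 6: the parent's list was never touched

if __name__ == '__main__':
    main()
```

The same value can appear more than once in popped, because two workers can each pop it from their own copy. That is the effect behind the repeated f in the question.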

Answer 2

One solution is to use multiprocessing.dummy, which uses threads instead of processes. Just change the import to:

from multiprocessing.dummy import Pool

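This "solves" the problem, but it does not protect the shared memory from concurrent access. You should still use a queue (with put and get) or a threading.Lock.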
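
As a minimal sketch of the queue approach (assuming Python 3, where the module is named queue; drain and its workers parameter are hypothetical names for this example), the items can be put on a queue up front and each thread can take exactly one:

```python
from multiprocessing.dummy import Pool  # thread-based Pool with the same API
import queue  # this module is called Queue on Python 2

def drain(files, workers=6):
    # Load every item onto a thread-safe queue before starting the threads.
    work = queue.Queue()
    for name in files:
        work.put(name)

    def convert(_):
        # get_nowait() is thread-safe: each item is handed to exactly one call.
        return work.get_nowait()

    pool = Pool(workers)
    try:
        # One call per item, so the queue is never empty when a call runs.
        return pool.map(convert, range(len(files)))
    finally:
        pool.close()
        pool.join()

if __name__ == '__main__':
    for value in drain(['a', 'b', 'c', 'd', 'e', 'f']):
        print(value)
```

Every item comes back exactly once, but which call receives which item depends on thread scheduling, so the order of the output is not guaranteed.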

Using Python pool.map to have multiple processes perform operations on a list