Python multiprocessing.dummy thread pool: `map` runs tasks on a thread that has not been initialized

Time: 2018-09-25 17:58:36

Tags: python multithreading threadpool python-multithreading

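I have the following code. I assumed the initializer runs before any task executes, but I am getting an error that suggests some tasks run on a thread that has not been initialized.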

import threading
import random
from multiprocessing.dummy import Pool, Value, Queue, Manager

def init_worker():
    global thread_local
    thread_local = threading.local()
    thread_local.worker_idx = random.randint(0, 10)
    print("++++++++++++++++++++++++ worker %s" %  thread_local.worker_idx)


def run(idx):
    print(dir(thread_local))
    worker_idx = thread_local.worker_idx
    print("==================== TASK ID %s by worker %s ====================" % (idx, worker_idx))


pool = Pool(2, init_worker)
pool.map(run, range(10), chunksize=1)

Output:

++++++++++++++++++++++++ worker 1
++++++++++++++++++++++++ worker 7
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'worker_idx']
==================== TASK ID 0 by worker 7 ====================
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'worker_idx']
==================== TASK ID 2 by worker 7 ====================
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'worker_idx']
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
==================== TASK ID 3 by worker 7 ====================
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'worker_idx']
==================== TASK ID 4 by worker 7 ====================
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'worker_idx']
==================== TASK ID 5 by worker 7 ====================
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'worker_idx']
==================== TASK ID 7 by worker 7 ====================
Traceback (most recent call last):
  File "test.py", line 19, in <module>
    pool.map(run, range(10), chunksize=1)
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "test.py", line 14, in run
    worker_idx = thread_local.worker_idx
AttributeError: '_thread._local' object has no attribute 'worker_idx'

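So it seems that both threads were initialized correctly, yet further tasks were started without the initializer having run first. The output of print(dir(thread_local)) is also very inconsistent.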

1 answer:

Answer 0 (score: 0)

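It looks like the problem is in the initializer. Notice that there is no `TASK ID ... by worker 1` printout, even though one thread clearly created a thread-local object and assigned its `worker_idx` attribute. That is because both threads create the thread-local object and bind it to the same global name. Thread 7 overwrites the thread-local object (the object itself, not the `worker_idx` attribute) that thread 1 created, so thread 1's `worker_idx` is lost: thread 1 now sees the new object, on which it has never set `worker_idx`. Instead, try creating the global in the main thread (the one that calls `map`), and only assign `worker_idx` in the thread initializer.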
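The suggestion above could be sketched roughly as follows. This is a minimal sketch rather than a definitive fix: it assumes the only change needed is where `thread_local` is created, and it swaps `random.randint` for `threading.get_ident()` (an assumption made here so that the two workers get distinct values). `run` returns its result instead of printing it so that the outcome is easy to inspect.

```python
import threading
from multiprocessing.dummy import Pool

# Create the thread-local object once, in the main thread,
# before any worker thread starts.
thread_local = threading.local()


def init_worker():
    # Only set an attribute on the shared thread-local object.
    # The global name is never rebound, so the other worker's
    # attribute is not discarded.
    # Assumption: get_ident() stands in for the random id in the question.
    thread_local.worker_idx = threading.get_ident()
    print("++++ worker %s" % thread_local.worker_idx)


def run(idx):
    # Each worker thread sees the value set by its own init_worker call.
    return idx, thread_local.worker_idx


with Pool(2, init_worker) as pool:
    results = pool.map(run, range(10), chunksize=1)

for idx, worker_idx in results:
    print("TASK ID %s by worker %s" % (idx, worker_idx))
```

With this layout, every task should find `worker_idx`, because each worker thread sets the attribute on the same `threading.local` instance and that instance is never replaced.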