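multiprocessing.Pool - PicklingError: Can't pickle <type 'thread.lock'>: attribute lookup thread.lock failed

Asked: 2011-10-23 09:51:29

Tags: python threadpool multiprocessing pickle

multiprocessing.Pool is driving me crazy... I want to upgrade many packages, and for each one I have to check whether a newer version is available. That check is done by the check_one function. The main code is in the Updater.update method: there I create the Pool object and call its map() method.

Here is the code: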

def check_one(args):
    res, total, package, version = args
    i = res.qsize()
    logger.info('\r[{0:.1%} - {1}, {2} / {3}]',
        i / float(total), package, i, total, addn=False)
    try:
        json = PyPIJson(package).retrieve()
        new_version = Version(json['info']['version'])
    except Exception as e:
        logger.error('Error: Failed to fetch data for {0} ({1})', package, e)
        return
    if new_version > version:
        res.put_nowait((package, version, new_version, json))

class Updater(FileManager):

    # __init__ and other methods...

    def update(self):    
        logger.info('Searching for updates')
        packages = Queue.Queue()
        data = ((packages, self.set_len, dist.project_name, Version(dist.version)) \
            for dist in self.working_set)
        pool = multiprocessing.Pool()
        pool.map(check_one, data)
        pool.close()
        pool.join()
        while True:
            try:
                package, version, new_version, json = packages.get_nowait()
            except Queue.Empty:
                break
            txt = 'A new release is available for {0}: {1!s} (old {2}), update'.format(
                package, new_version, version)
            u = logger.ask(txt, bool=('upgrade version', 'keep working version'), dont_ask=self.yes)
            if u:
                self.upgrade(package, json, new_version)
            else:
                logger.info('{0} has not been upgraded', package)
        self._clean()
        logger.success('Updating finished successfully')

When I run it, I get this strange error:

Searching for updates
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/multiprocessing/pool.py", line 225, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'thread.lock'>: attribute lookup thread.lock failed

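3 answers:

Answer 0 (score: 27)

multiprocessing passes tasks (which include check_one and data) to the worker processes through a queue of its own. Everything put on that queue must be picklable, and a Queue.Queue is not picklable: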

import multiprocessing as mp
import Queue

def foo(queue):
    pass

pool=mp.Pool()
q=Queue.Queue()

pool.map(foo,(q,))

This raises the following exception:

UnpickleableError: Cannot pickle <type 'thread.lock'> objects

Your data includes packages, which is a Queue.Queue. That is probably the source of the problem.
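
A quick way to check this without starting a pool is to try pickling the objects directly. The sketch below is only an illustration: is_picklable is a hypothetical helper, not part of multiprocessing, and the try/except import is an assumption added so that it runs on both Python 2 (Queue) and Python 3 (queue).

```python
import pickle
import threading

try:
    import Queue as queue  # Python 2 module name
except ImportError:
    import queue  # Python 3 module name


def is_picklable(obj):
    """Hypothetical helper: True if pickle.dumps(obj) succeeds, else False."""
    try:
        pickle.dumps(obj)
    except Exception:
        # pickle reports failures with different exception types
        # depending on the Python version, so catch broadly here.
        return False
    return True


if __name__ == '__main__':
    # A plain tuple of picklable values, like the rest of each task.
    print(is_picklable((10, 'project-0', '1.0')))
    # The lock that the error message complains about.
    print(is_picklable(threading.Lock()))
    # A thread queue holds such locks internally.
    print(is_picklable(queue.Queue()))
```

Anything for which this helper returns False cannot be sent to a Pool worker as part of a task.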


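Here is a possible workaround. The Queue is being used for two purposes:

  1. To find out the approximate size (by calling qsize).
  2. To store results for later retrieval.

Instead of calling qsize, we could use an mp.Value to share a value between multiple processes.

Instead of storing the results in a queue, we can (and should) simply return values from the calls to check_one. pool.map collects the results in a queue of its own making and returns them as the return value of pool.map.

For example: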

    import multiprocessing as mp
    import random
    import logging

    # logger = mp.log_to_stderr(logging.DEBUG)
    logger = logging.getLogger(__name__)

    # Shared counter that replaces the qsize() calls. As a module-level global
    # it reaches the workers by inheritance, which relies on the fork start method.
    qsize = mp.Value('i', 1)

    def check_one(args):
        total, package, version = args
        # Read and increment under the Value's lock so that concurrent
        # workers do not lose updates.
        with qsize.get_lock():
            i = qsize.value
            qsize.value += 1
        logger.info('\r[{0:.1%} - {1}, {2} / {3}]'.format(
            i / float(total), package, i, total))
        # Random stand-in for the real version lookup.
        new_version = random.randrange(0, 100)
        if new_version > version:
            return (package, version, new_version, None)
        return None

    def update():
        logger.info('Searching for updates')
        set_len = 10
        data = ((set_len, 'project-{0}'.format(i), random.randrange(0, 100))
                for i in range(set_len))
        pool = mp.Pool()
        # pool.map returns the list of values returned by check_one.
        results = pool.map(check_one, data)
        pool.close()
        pool.join()
        for result in results:
            if result is None:
                continue
            package, version, new_version, json = result
            txt = 'A new release is available for {0}: {1!s} (old {2}), update'.format(
                package, new_version, version)
            logger.info(txt)
        logger.info('Updating finished successfully')

    if __name__ == '__main__':
        logging.basicConfig(level=logging.DEBUG)
        update()
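
If the counter should not depend on being a module-level global, one alternative is to hand the mp.Value to each worker through the Pool's initializer and initargs arguments. This is a minimal sketch under assumptions: init_worker, run and the simplified two-field task tuple are made up for illustration and are not part of the original code.

```python
import multiprocessing as mp

# Per-process reference to the shared counter; filled in by init_worker.
_counter = None


def init_worker(counter):
    """Runs once in each worker process and stores the shared counter."""
    global _counter
    _counter = counter


def check_one(args):
    total, package = args
    # Increment and read under the lock so each task gets a unique position.
    with _counter.get_lock():
        _counter.value += 1
        i = _counter.value
    return (package, i, total)


def run(packages):
    """Check every package in a pool; return (results, final counter value)."""
    counter = mp.Value('i', 0)
    # The Value is passed when the workers are created, not inside a task,
    # so it never has to go through the task queue.
    pool = mp.Pool(initializer=init_worker, initargs=(counter,))
    try:
        results = pool.map(check_one, [(len(packages), p) for p in packages])
    finally:
        pool.close()
        pool.join()
    return results, counter.value
```

Call run(['a', 'b', 'c']) from under an if __name__ == '__main__': guard so that it also works where worker processes are started by spawning rather than forking.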
    

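Answer 1 (score: 5)

After a lot of digging on a similar issue...

It turns out that any object that happens to contain a threading.Condition() object will never work with multiprocessing.Pool.

Here is an example: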

import multiprocessing as mp
import threading

class MyClass(object):
    def __init__(self):
        self.cond = threading.Condition()

def foo(mc):
    pass

pool = mp.Pool()
mc = MyClass()
pool.map(foo, (mc,))

I ran it with Python 2.7.5 and hit the same error:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 342, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'thread.lock'>: attribute lookup thread.lock failed

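I then ran it on Python 3.4.1, where the problem has been fixed.

I have not yet found any useful workaround, though, for those of us still on 2.7.x.

Answer 2 (score: 1)

I ran into this problem with Python 3.6 in Docker. After changing the version to 3.7.3, the problem was resolved.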
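
One option that might be worth trying on 2.7.x is to tell pickle to leave the Condition out and to build a fresh one when the object is unpickled, using __getstate__ and __setstate__. This is only a sketch: it assumes each worker can live with its own private Condition (the copy is not synchronised with the one in the parent process), and the name attribute is made up for illustration.

```python
import pickle
import threading


class MyClass(object):
    def __init__(self, name):
        self.name = name  # hypothetical picklable attribute
        self.cond = threading.Condition()

    def __getstate__(self):
        # Copy the instance dict and drop the unpicklable Condition.
        state = self.__dict__.copy()
        del state['cond']
        return state

    def __setstate__(self, state):
        # Restore the picklable attributes, then create a new Condition
        # that belongs to the process doing the unpickling.
        self.__dict__.update(state)
        self.cond = threading.Condition()


if __name__ == '__main__':
    original = MyClass('demo')
    # Round trip through pickle, which is what Pool does with each task.
    clone = pickle.loads(pickle.dumps(original))
    print(clone.name)
```

Because the Condition in each copy is independent, this only suits cases where the lock protects state local to one process.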