Question

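I'm having trouble collecting logs from the following script. As soon as I set SLEEP_TIME to a "small" value, the LoggingThread threads somehow block the logging module. The script freezes on the logging call in the action function. If SLEEP_TIME is about 0.1, the script collects all the log messages I expect.

I tried to follow this answer, but it did not solve my problem.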

import multiprocessing
import threading
import logging
import time

SLEEP_TIME = 0.000001

logger = logging.getLogger()

ch = logging.StreamHandler()
ch.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(funcName)s(): %(message)s'))
ch.setLevel(logging.DEBUG)

logger.setLevel(logging.DEBUG)
logger.addHandler(ch)


class LoggingThread(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        while True:
            logger.debug('LoggingThread: {}'.format(self))
            time.sleep(SLEEP_TIME)


def action(i):
    logger.debug('action: {}'.format(i))


def do_parallel_job():

    processes = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=processes)
    for i in range(20):
        pool.apply_async(action, args=(i,))
    pool.close()
    pool.join()



if __name__ == '__main__':

    logger.debug('START')

    #
    # multithread part
    #
    for _ in range(10):
        lt = LoggingThread()
        lt.daemon = True  # attribute form; setDaemon() is deprecated since Python 3.10
        lt.start()

    #
    # multiprocess part
    #
    do_parallel_job()

    logger.debug('FINISH')

How do I use the logging module in a script that is both multiprocess and multithreaded?

Answer 1

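This is probably bug 6721.

The problem is common in any situation that combines locks, threads, and forks. If thread 1 holds a lock at the moment thread 2 calls fork, then the forked process contains only thread 2, and the lock stays held forever because the thread that would release it does not exist there. In your case, that lock is logging.StreamHandler.lock.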
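To see the effect in isolation, here is a minimal sketch that leaves logging out entirely. It assumes a POSIX system (it calls os.fork directly), and the names hold_lock and child_can_acquire are made up for illustration. One thread holds a plain threading.Lock while the main thread forks; the child then tries to take the same lock with a timeout.

```python
import os
import threading
import time

lock = threading.Lock()
holding = threading.Event()


def hold_lock():
    # "Thread 1": take the lock and keep it while the fork happens.
    with lock:
        holding.set()
        time.sleep(1.0)


def child_can_acquire():
    """Fork while another thread holds `lock`; report whether the child got it."""
    holding.clear()
    t = threading.Thread(target=hold_lock)
    t.start()
    holding.wait()      # the lock is now held by the other thread
    pid = os.fork()     # "thread 2" forks
    if pid == 0:
        # Child: only the forking thread exists here, but the lock
        # was copied in its held state and nobody is left to release it.
        got = lock.acquire(timeout=0.5)
        os._exit(0 if got else 1)
    _, status = os.waitpid(pid, 0)
    t.join()
    return os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("child acquired the lock:", child_can_acquire())
```

Without the timeout, the child would simply hang, which matches the freeze described in the question. Note that newer Python versions may emit a DeprecationWarning when fork is called from a multithreaded process.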

A fix for the logging module can be found here. Note that you also need to take care of any other locks.
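As a rough idea of what such a fix can look like (this is a sketch, not the linked patch), you can register a hook that gives each handler a fresh lock in the child right after fork. It assumes Python 3.7+ on POSIX, where os.register_at_fork is available; the function name reinit_logging_locks is hypothetical. It only covers handlers attached to the root logger, so locks held elsewhere (other loggers' handlers, your own locks) still need separate care. As far as I know, recent CPython releases already do something similar for logging's own locks, so this mainly matters for older versions or custom locks.

```python
import logging
import os


def reinit_logging_locks():
    """Runs in the child right after fork().

    Replaces each root-logger handler's lock, since the copy inherited
    from the parent may have been held by a thread that no longer exists.
    """
    for handler in logging.getLogger().handlers:
        handler.createLock()  # installs a new, unlocked lock on the handler


# Assumption: os.register_at_fork exists (Python 3.7+, POSIX only).
os.register_at_fork(after_in_child=reinit_logging_locks)
```

Register the hook once, after the handlers are configured and before the pool is created. Another option that avoids the inherited-lock problem altogether is to start worker processes with the "spawn" start method instead of "fork", at the cost of having to configure logging again inside each worker.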

Answer 2

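I recently ran into a similar problem when using the logging module together with the Pathos multiprocessing library. I'm still not 100% sure, but it seems to me that the problem may be caused by the logging handler trying to reuse a lock object from a different process.

I was able to work around it with a simple wrapper around the default logging handler, which keeps a separate lock per process ID: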

import threading
from collections import defaultdict
from multiprocessing import current_process

import colorlog


class ProcessSafeHandler(colorlog.StreamHandler):
    def __init__(self):
        super().__init__()

        self._locks = defaultdict(lambda: threading.RLock())

    def acquire(self):
        current_process_id = current_process().pid
        self._locks[current_process_id].acquire()

    def release(self):
        current_process_id = current_process().pid
        self._locks[current_process_id].release()

Deadlock when logging in a multiprocess/multithreaded Python script
