Logging to separate files from class instances running in different processes

Time: 2017-05-03 09:43:34

Tags: python logging multiprocessing

Question

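In the main process, I instantiate multiple class instances that each run a method in parallel and should log to their own log file. Before and after they do their work, some events from the main process should be logged to yet another file.

Since the same file is never accessed in parallel at any time during program execution, I do not use a queue to serialize the logging events. I just use a base logger and, for each module, a separate logger that inherits from the base logger.

My problem now is that the class instances that execute their methods in parallel use functions from a utils module. The logger in this utils module should log to the file of the class instance that is using it, which, as far as I know, it can only do if it knows the right logger name.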

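Example code

I reduced the real code to a minimal working example to make my problem easier to understand. In the main module, I instantiate a base logger named 'Main' that has only a StreamHandler and from which every other logger in the application inherits: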

# Content of main.py

import logging
import multiprocessing
import time

from worker import Worker
from container import Container

logger = logging.getLogger('Main')

def setup_base_logger():
    formatter = logging.Formatter('%(asctime)s - %(name)-14s - %(levelname)8s - %(message)s')
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    logger.addHandler(console_handler)

if __name__ == '__main__':
    multiprocessing.freeze_support()
    setup_base_logger()
    logger.warning('Starting the main program')
    container = Container([Worker(name='Worker_Nr.%d' % i) for i in range(4)])
    container.run()

The Container class is defined in container.py and just holds a list of Worker instances:

# Content of container.py

import logging
import multiprocessing

logger = logging.getLogger('Main.container')

def run_worker(worker):
    worker.run()

class Container:
    def __init__(self, workers):
        self.workers = workers

    def run(self):
        logger.warning('The workers begin to run ...')
        pool = multiprocessing.Pool(processes=4, maxtasksperchild=1)
        pool.map(run_worker, self.workers)
        logger.warning('Workers finished running.')

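Its task is to execute the run() method of the workers in parallel. I use multiprocessing.Pool because I need to limit the number of processors used. The Worker class is defined in the module worker.py: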

# Content of worker.py

import logging
import os
import time

import util

def configure_logger(name, logfile):
    logger = logging.getLogger(name)
    formatter = logging.Formatter('%(asctime)s - %(name)-14s - %(levelname)-8s - %(message)s')
    file_handler = logging.FileHandler(logfile, mode='w')
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

class Worker:
    def __init__(self, name):
        self.name = name
        self.run_time = 2
        logger_name = 'Main.worker.' + name
        configure_logger(name=logger_name, logfile=self.name + '.log')
        self.logger = logging.getLogger(logger_name)

    def __getstate__(self):
        d = self.__dict__.copy()
        if 'logger' in d:
            d['logger'] = d['logger'].name
        return d

    def __setstate__(self, d):
        if 'logger' in d:
            d['logger'] = logging.getLogger(d['logger'])
        self.__dict__.update(d)

    def run(self):
        self.logger.warning('{0} is running for {1} seconds with process id {2}'.format(self.name, self.run_time, os.getpid()))
        time.sleep(self.run_time)
        util.print_something(os.getpid())
        self.logger.warning('{} woke up!'.format(self.name))

Since there should be one log file for each Worker instance, I figured Worker needs a logger instance as an attribute. The utils module looks like this:

# Content of util.py

import logging

logger = logging.getLogger('Main.util')

def print_something(s):
    print(s)
    logger.warning('%s was just printed', s)

Executing main.py gives the following output:

2017-05-03 11:08:05,738 - Main           -  WARNING - Starting the main program
2017-05-03 11:08:05,740 - Main.container -  WARNING - The workers begin to run ...
Worker_Nr.0 is running for 2 seconds with process id 5532
Worker_Nr.1 is running for 2 seconds with process id 17908
Worker_Nr.2 is running for 2 seconds with process id 19796
Worker_Nr.3 is running for 2 seconds with process id 10804
5532
5532 was just printed
Worker_Nr.0 woke up!
17908
19796
17908 was just printed
19796 was just printed
Worker_Nr.1 woke up!
Worker_Nr.2 woke up!
10804
10804 was just printed
Worker_Nr.3 woke up!
2017-05-03 11:08:07,941 - Main.container -  WARNING - Workers finished running.

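As you can see, the log records created by the Worker instances are missing their formatting. In addition, the log files that get created are empty. How is that possible if a handler with a formatter was added by configure_logger() in Worker.__init__?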

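What I have tried

  • Passing the logger name to every function in the utils module. This works, but seems overly complicated, since there are many functions in util.py and more modules are used in the same way.
  • Similar questions about logging in multiprocessing applications usually want to log to the same file from different processes; I want a separate log file for each process.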

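Questions

  1. How can the log records created in the utils module (and possibly other modules) go to the correct log file?
  2. Everything logged by a Worker instance is sent to stdout without formatting, and nothing is written to the log files (although they are created). Why?

I'm using Python 3.5.1 on Windows 7 64-bit.

If you think it would be a lot easier to work with a Queue and a logging thread in the main process, that would be totally acceptable. My only concern is the order of the logs. I guess I could sort them afterwards, as suggested in a couple of other posts.

I'm at my wits' end, and any help or hint in the right direction is greatly appreciated!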

2 answers:

Answer 0 (score: 0)

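You have to repeat

configure_logger(name=logger_name, logfile=self.name + '.log')

in each process, for example at the start of run():

def run(self):
    # logger_name was a local variable of __init__, so rebuild it here
    logger_name = 'Main.worker.' + self.name
    configure_logger(name=logger_name, logfile=self.name + '.log')
    ...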
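
A likely reason the call has to be repeated: on Windows, multiprocessing starts each child as a fresh interpreter, so handlers added in the parent's Worker.__init__ do not exist there. The logger recreated in __setstate__ then finds no handler anywhere in its hierarchy, and Python 3 falls back to its last-resort handler, which writes the bare message to stderr. The empty files were created when the parent opened them. The sketch below is one way to apply the advice, not the asker's final code: the Worker keeps only picklable strings and builds its handler inside run(). The log_dir parameter and the `if not logger.handlers` guard are assumptions added for illustration.

```python
import logging
import os


def configure_logger(name, logfile):
    """Attach a formatted FileHandler to the named logger, at most once per process."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # assumption: avoid stacking duplicates on repeated calls
        formatter = logging.Formatter(
            '%(asctime)s - %(name)-14s - %(levelname)-8s - %(message)s')
        file_handler = logging.FileHandler(logfile, mode='w')
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger


class Worker:
    def __init__(self, name, log_dir='.'):  # log_dir is a hypothetical extra parameter
        # Store only plain strings so the instance can be pickled as-is.
        self.name = name
        self.logfile = os.path.join(log_dir, name + '.log')

    def run(self):
        # This executes in the child process, so the handler is created where it is used.
        logger = configure_logger('Main.worker.' + self.name, self.logfile)
        logger.warning('%s is running with process id %d', self.name, os.getpid())
        return self.logfile
```

With this shape, pool.map(run_worker, workers) from container.py should work unchanged, and the __getstate__/__setstate__ pair is no longer needed because no logger is stored on the instance.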

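Answer 1 (score: 0)

With this minimal example, I was able to reproduce the original error that presumably prompted you to modify the Worker class so that it could be pickled: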

import logging
import multiprocessing
import time

def configure_logger(name, logfile):
    logger = logging.getLogger(name)
    formatter = logging.Formatter('%(asctime)s - %(name)-14s - %(levelname)-8s - %(message)s')
    file_handler = logging.FileHandler(logfile, mode='w')
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    logger.setLevel(logging.DEBUG)

class Worker:
    def __init__(self, number):
        self.name = "worker%d" % number
        self.log_file = "%s.log" % self.name
        configure_logger(self.name, self.log_file)
        self.logger = logging.getLogger(self.name)

    def run(self):
        self.logger.info("%s is running...", self.name)
        time.sleep(1.0)
        self.logger.info("%s is exiting...", self.name)

def run_worker(worker):
    worker.run()

N = 4
workers = [Worker(n) for n in range(N)]
pool = multiprocessing.Pool(processes=N, maxtasksperchild=1)
pool.map(run_worker, workers)

Here is the exception traceback from running this program:

Traceback (most recent call last):
  File "custom.py", line 31, in <module>
    pool.map(run_worker, workers)
  File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
TypeError: can't pickle thread.lock objects
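
The lock in the message most likely belongs to a logging handler: Pool.map pickles every Worker to send it to a child process, the Worker holds a Logger, and under Python 2.7 that Logger is pickled together with its handlers, each of which creates a thread lock. A small sketch that isolates the same failure on a bare handler (written for Python 3, where the exception text differs slightly):

```python
import logging
import pickle

# Every logging.Handler creates a threading.RLock in its constructor.
handler = logging.StreamHandler()

try:
    pickle.dumps(handler)
    picklable = True
except TypeError as exc:
    picklable = False
    print('cannot pickle the handler:', exc)

# A logger *name* is a plain string and pickles without trouble.
name_roundtrip = pickle.loads(pickle.dumps('worker0'))
```

From Python 3.7 on, Logger objects themselves are pickled by name, so the script above may no longer fail at this point on newer interpreters. Handlers still cannot be pickled.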

The solution is not to change how the Worker class is pickled, but to call logging.getLogger inside the run method:

class Worker:
    def __init__(self, number):
        self.name = "worker%d" % number
        self.log_file = "%s.log" % self.name
        configure_logger(self.name, self.log_file)

    def run(self):
        self.logger = logging.getLogger(self.name)
        self.logger.info("%s is running...", self.name)
        time.sleep(1.0)
        self.logger.info("%s is exiting...", self.name)

With this change, the program runs and produces the expected log files.
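
Note that this relies on the child inheriting the handlers configured in __init__, which holds when processes are forked (as on the macOS setup in the traceback) but probably not on Windows, where children start fresh. It also leaves the first question open. One possible approach, sketched below under assumptions, is to attach the file handler to the shared 'Main' ancestor inside the child for the duration of run(), so that records from 'Main.util' and from the worker's own logger both propagate into that worker's file. log_to_file and log_dir are hypothetical names, and print_something is copied from the question's util.py so the block is self-contained. The sketch assumes each process runs one worker at a time and does not log from other threads.

```python
import contextlib
import logging
import os

FORMAT = '%(asctime)s - %(name)-14s - %(levelname)-8s - %(message)s'

util_logger = logging.getLogger('Main.util')  # stands in for the logger in util.py


def print_something(s):
    # Same body as util.print_something in the question.
    print(s)
    util_logger.warning('%s was just printed', s)


@contextlib.contextmanager
def log_to_file(logfile, base_name='Main'):
    """Temporarily route every record below base_name in this process to logfile."""
    base = logging.getLogger(base_name)
    handler = logging.FileHandler(logfile, mode='w')
    handler.setFormatter(logging.Formatter(FORMAT))
    base.addHandler(handler)
    try:
        yield
    finally:
        # Detach again so a reused pool process does not leak into the next file.
        base.removeHandler(handler)
        handler.close()


class Worker:
    def __init__(self, name, log_dir='.'):  # log_dir is a hypothetical extra parameter
        self.name = name
        self.logfile = os.path.join(log_dir, name + '.log')

    def run(self):
        with log_to_file(self.logfile):
            logger = logging.getLogger('Main.worker.' + self.name)
            logger.warning('%s is running with process id %d', self.name, os.getpid())
            print_something(os.getpid())  # util needs no logger name passed in
            logger.warning('%s woke up!', self.name)
        return self.logfile
```

Because the handler sits on 'Main', none of the functions in util.py need an extra argument; they keep using their module-level logger and the records follow normal propagation.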