In the main process, I instantiate several instances of a class that run a method in parallel, and each instance should log to its own log file. Before and after they do their work, some events in the main process should be logged to another file.
Since the same file is never accessed concurrently during execution, I do not use a queue to serialize the logging events. I just use a base logger plus one dedicated logger per module, which inherits from the base logger.
My problem now is that the class instances whose methods run in parallel use functions from a utils module. The logger in this utils module should log to the file of whichever class instance is using it, which it can only do if it knows the correct name of that logger.
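The base-logger-plus-child-loggers setup relies on logging's dotted-name hierarchy; a minimal sketch of the mechanism (names here are illustrative):

```python
import logging

# Handlers live only on the base logger; child loggers obtained by
# dotted-name lookup propagate their records up to it.
base = logging.getLogger('Main')
base.addHandler(logging.StreamHandler())

child = logging.getLogger('Main.util')
print(child.handlers)        # the child has no handlers of its own
print(child.parent is base)  # records propagate up to 'Main'
```

This is why a single handler on 'Main' is enough for console output from every 'Main.*' logger.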
I reduced the actual code to a minimal working example to help illustrate my problem. In the main module, I instantiate a logger named 'Main' that only has a StreamHandler, and every other logger in the application inherits from it:
# Content of main.py
import logging
import multiprocessing
import time
from worker import Worker
from container import Container

logger = logging.getLogger('Main')

def setup_base_logger():
    formatter = logging.Formatter('%(asctime)s - %(name)-14s - %(levelname)8s - %(message)s')
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    logger.addHandler(console_handler)

if __name__ == '__main__':
    multiprocessing.freeze_support()
    setup_base_logger()
    logger.warning('Starting the main program')
    container = Container([Worker(name='Worker_Nr.%d' % i) for i in range(4)])
    container.run()
The Container class is defined in container.py and just holds a list of Worker instances:
# Content of container.py
import logging
import multiprocessing

logger = logging.getLogger('Main.container')

def run_worker(worker):
    worker.run()

class Container:
    def __init__(self, workers):
        self.workers = workers

    def run(self):
        logger.warning('The workers begin to run ...')
        pool = multiprocessing.Pool(processes=4, maxtasksperchild=1)
        pool.map(run_worker, self.workers)
        logger.warning('Workers finished running.')
Its job is to execute the workers' run() methods in parallel. I use multiprocessing.Pool because I need to limit the number of processors used. The Worker class is defined in the module worker.py:
# Content of worker.py
import logging
import os
import time
import util

def configure_logger(name, logfile):
    logger = logging.getLogger(name)
    formatter = logging.Formatter('%(asctime)s - %(name)-14s - %(levelname)-8s - %(message)s')
    file_handler = logging.FileHandler(logfile, mode='w')
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

class Worker:
    def __init__(self, name):
        self.name = name
        self.run_time = 2
        logger_name = 'Main.worker.' + name
        configure_logger(name=logger_name, logfile=self.name + '.log')
        self.logger = logging.getLogger(logger_name)

    def __getstate__(self):
        d = self.__dict__.copy()
        if 'logger' in d:
            d['logger'] = d['logger'].name
        return d

    def __setstate__(self, d):
        if 'logger' in d:
            d['logger'] = logging.getLogger(d['logger'])
        self.__dict__.update(d)

    def run(self):
        self.logger.warning('{0} is running for {1} seconds with process id {2}'.format(self.name, self.run_time, os.getpid()))
        time.sleep(self.run_time)
        util.print_something(os.getpid())
        self.logger.warning('{} woke up!'.format(self.name))
Since every instance of Worker should have its own log file, I figured Worker needs a logger instance as an attribute. The utils module looks like this:
# Content of util.py
import logging

logger = logging.getLogger('Main.util')

def print_something(s):
    print(s)
    logger.warning('%s was just printed', s)
Executing main.py gives the following output:
2017-05-03 11:08:05,738 - Main - WARNING - Starting the main program
2017-05-03 11:08:05,740 - Main.container - WARNING - The workers begin to run ...
Worker_Nr.0 is running for 2 seconds with process id 5532
Worker_Nr.1 is running for 2 seconds with process id 17908
Worker_Nr.2 is running for 2 seconds with process id 19796
Worker_Nr.3 is running for 2 seconds with process id 10804
5532
5532 was just printed
Worker_Nr.0 woke up!
17908
19796
17908 was just printed
19796 was just printed
Worker_Nr.1 woke up!
Worker_Nr.2 woke up!
10804
10804 was just printed
Worker_Nr.3 woke up!
2017-05-03 11:08:07,941 - Main.container - WARNING - Workers finished running.
As you can see, the log records created by the Worker instances have no formatting. Also, the log files that are created stay empty. How can that be, when a handler with a formatter is added in configure_logger(), which is called from Worker.__init__?

What I tried

Everything the Worker instances log ends up on stdout without formatting, and nothing is written to the log files (although they are created). Why? I am using Python 3.5.1 on Windows 7 64-bit.
If you think it is much easier to use a Queue with a logging thread in the main process, that would be perfectly acceptable. My only concern is the order of the logs. I guess I could sort them afterwards, as suggested in several other posts.
I am at my wit's end; any help or hint in the right direction is greatly appreciated!
Answer 0 (score: 0)
You have to repeat

    configure_logger(name=logger_name, logfile=self.name + '.log')

in each process:

    def run(self):
        # Reconfigure inside the child process; the handler set up in the
        # parent does not survive the trip through pickling.
        configure_logger(name='Main.worker.' + self.name, logfile=self.name + '.log')
        ...
Answer 1 (score: 0)
With this minimal example, I was able to reproduce the original error that prompted you to modify the Worker class so that it could be pickled:
import logging
import multiprocessing
import time

def configure_logger(name, logfile):
    logger = logging.getLogger(name)
    formatter = logging.Formatter('%(asctime)s - %(name)-14s - %(levelname)-8s - %(message)s')
    file_handler = logging.FileHandler(logfile, mode='w')
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    logger.setLevel(logging.DEBUG)

class Worker:
    def __init__(self, number):
        self.name = "worker%d" % number
        self.log_file = "%s.log" % self.name
        configure_logger(self.name, self.log_file)
        self.logger = logging.getLogger(self.name)

    def run(self):
        self.logger.info("%s is running...", self.name)
        time.sleep(1.0)
        self.logger.info("%s is exiting...", self.name)

def run_worker(worker):
    worker.run()

N = 4
workers = [Worker(n) for n in range(N)]
pool = multiprocessing.Pool(processes=N, maxtasksperchild=1)
pool.map(run_worker, workers)
This is the exception traceback from running that program:
Traceback (most recent call last):
  File "custom.py", line 31, in <module>
    pool.map(run_worker, workers)
  File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
TypeError: can't pickle thread.lock objects
The solution is not to change how the Worker class is pickled, but to call logging.getLogger in the run method instead:
class Worker:
    def __init__(self, number):
        self.name = "worker%d" % number
        self.log_file = "%s.log" % self.name
        configure_logger(self.name, self.log_file)

    def run(self):
        self.logger = logging.getLogger(self.name)
        self.logger.info("%s is running...", self.name)
        time.sleep(1.0)
        self.logger.info("%s is exiting...", self.name)
With this change, the program runs and produces the expected log files.
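The root cause is easy to demonstrate in isolation: a logging handler carries a threading lock (and a FileHandler additionally an open file object), and neither can be pickled, so any object that drags a handler-bearing logger into Pool.map will fail the same way:

```python
import logging
import pickle
import tempfile

# Create a FileHandler pointing at a throwaway temp file.
logfile = tempfile.NamedTemporaryFile(suffix='.log', delete=False).name
handler = logging.FileHandler(logfile, mode='w')

try:
    pickle.dumps(handler)
except TypeError as exc:
    # e.g. "cannot pickle '_thread.RLock' object"
    print('pickling failed:', exc)
```

Fetching the logger by name inside run() avoids the problem because only the plain string self.name crosses the process boundary.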