我运行10个进程,每个进程有10个线程,并且它们在30秒内经常且经常使用logging.info()
和logging.debug()
写入10个日志文件(每个进程一个)。
通常在10秒后,就会发生死锁。进程停止处理(所有进程)。
gdp python [pid]
与py-bt
和info threads
表示它停留在这里:
Id Target Id Frame
* 1 Thread 0x7ff50f020740 (LWP 1622) "python" 0x00007ff50e8276d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x564f17c8aa80)
at ../sysdeps/unix/sysv/linux/futex-internal.h:205
2 Thread 0x7ff509636700 (LWP 1624) "python" 0x00007ff50eb57bb7 in epoll_wait (epfd=8, events=0x7ff5096351d0, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
3 Thread 0x7ff508e35700 (LWP 1625) "python" 0x00007ff50eb57bb7 in epoll_wait (epfd=12, events=0x7ff508e341d0, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
4 Thread 0x7ff503fff700 (LWP 1667) "python" 0x00007ff50e8276d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x564f17c8aa80)
at ../sysdeps/unix/sysv/linux/futex-internal.h:205
...[threads 5-6 like 4]...
7 Thread 0x7ff5027fc700 (LWP 1690) "python" 0x00007ff50eb46187 in __GI___libc_write (fd=2, buf=0x7ff50967bc24, nbytes=85) at ../sysdeps/unix/sysv/linux/write.c:27
...[threads 8-13 like 4]...
线程7的堆栈:
Traceback (most recent call first):
File "/usr/lib/python2.7/logging/__init__.py", line 889, in emit
stream.write(fs % msg)
...[skipped useless lines]...
这段代码(我猜想记录__init__
函数的代码):
884 #the codecs module, but fail when writing to a
885 #terminal even when the codepage is set to cp1251.
886 #An extra encoding step seems to be needed.
887 stream.write((ufs % msg).encode(stream.encoding))
888 else:
>889 stream.write(fs % msg)
890 except UnicodeError:
891 stream.write(fs % msg.encode("UTF-8"))
892 self.flush()
893 except (KeyboardInterrupt, SystemExit):
894 raise
其余线程的堆栈类似-等待GIL:
Traceback (most recent call first):
Waiting for the GIL
File "/usr/lib/python2.7/threading.py", line 174, in acquire
rc = self.__block.acquire(blocking)
File "/usr/lib/python2.7/logging/__init__.py", line 715, in acquire
self.lock.acquire()
...[skipped useless lines]...
它写道,软件包logging
是多线程的,没有附加的锁。那么为什么包logging
可能会死锁?它会打开太多文件描述符还是限制其他内容?
这就是我初始化它的方式(如果重要的话):
def get_logger(log_level, file_name='', log_name=''):
if len(log_name) != 0:
logger = logging.getLogger(log_name)
else:
logger = logging.getLogger()
logger.setLevel(logger_state[log_level])
formatter = logging.Formatter('%(asctime)s [%(levelname)s][%(name)s:%(funcName)s():%(lineno)s] - %(message)s')
# file handler
if len(file_name) != 0:
fh = logging.FileHandler(file_name)
fh.setLevel(logging.DEBUG)
fh.setFormatter(formatter)
logger.addHandler(fh)
# console handler
console_out = logging.StreamHandler()
console_out.setLevel(logging.DEBUG)
console_out.setFormatter(formatter)
logger.addHandler(console_out)
return logger
答案 0 :(得分:0)
问题是因为我一直在将输出写入控制台和文件中,但是所有这些进程都是通过重定向到管道进行初始化的,从未被监听。
p = Popen(proc_params,
stdout=PIPE,
stderr=STDOUT,
close_fds=ON_POSIX,
bufsize=1
)
因此,在这种情况下,管道似乎有其缓冲区大小限制,并且在填充死锁之后。
在此说明:https://docs.python.org/2/library/subprocess.html
Note
Do not use stdout=PIPE or stderr=PIPE with this function as that can deadlock based on the child process output volume. Use Popen with the communicate() method when you need pipes.
对于我不使用的功能已经完成了,但是对于Popen运行来说似乎是有效的,如果那样的话你不读出管道。