Logging deadlock with multiple threads in Python

Asked: 2019-02-19 12:34:12

Tags: multithreading python-2.7 logging

I run 10 processes with 10 threads each. For about 30 seconds they call logging.info() and logging.debug() very frequently, writing to 10 log files (one per process).

Usually a deadlock occurs after about 10 seconds, and all of the processes stop making progress.

Attaching with gdb python [pid] and running py-bt and info threads shows that it is stuck here:

  Id   Target Id                                 Frame 
* 1    Thread 0x7ff50f020740 (LWP 1622) "python" 0x00007ff50e8276d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x564f17c8aa80)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  2    Thread 0x7ff509636700 (LWP 1624) "python" 0x00007ff50eb57bb7 in epoll_wait (epfd=8, events=0x7ff5096351d0, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  3    Thread 0x7ff508e35700 (LWP 1625) "python" 0x00007ff50eb57bb7 in epoll_wait (epfd=12, events=0x7ff508e341d0, maxevents=256, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  4    Thread 0x7ff503fff700 (LWP 1667) "python" 0x00007ff50e8276d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x564f17c8aa80)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:205
...[threads 5-6 like 4]...
  7    Thread 0x7ff5027fc700 (LWP 1690) "python" 0x00007ff50eb46187 in __GI___libc_write (fd=2, buf=0x7ff50967bc24, nbytes=85) at ../sysdeps/unix/sysv/linux/write.c:27
...[threads 8-13 like 4]...

Stack of thread 7:

Traceback (most recent call first):
  File "/usr/lib/python2.7/logging/__init__.py", line 889, in emit
    stream.write(fs % msg)
...[skipped useless lines]...

That corresponds to this code (from the logging package's __init__.py, I assume):

 884                                #the codecs module, but fail when writing to a
 885                                #terminal even when the codepage is set to cp1251.
 886                                #An extra encoding step seems to be needed.
 887                                stream.write((ufs % msg).encode(stream.encoding))
 888                        else:
>889                            stream.write(fs % msg)
 890                    except UnicodeError:
 891                        stream.write(fs % msg.encode("UTF-8"))
 892                self.flush()
 893            except (KeyboardInterrupt, SystemExit):
 894                raise

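The stacks of the remaining threads all look alike. They are waiting for the GIL inside the logging handler's lock acquire: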

Traceback (most recent call first):
  Waiting for the GIL
  File "/usr/lib/python2.7/threading.py", line 174, in acquire
    rc = self.__block.acquire(blocking)
  File "/usr/lib/python2.7/logging/__init__.py", line 715, in acquire
    self.lock.acquire()
...[skipped useless lines]...

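The documentation says the logging package is thread-safe and needs no additional locking. So why would logging deadlock? Is it opening too many file descriptors, or hitting some other limit?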

This is how I initialize it (in case it matters):

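import logging

# logger_state is defined elsewhere (not shown here); it is assumed to map log_level to a logging level.
def get_logger(log_level, file_name='', log_name=''):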
    if len(log_name) != 0:
        logger = logging.getLogger(log_name)
    else:
        logger = logging.getLogger()
    logger.setLevel(logger_state[log_level])
    formatter = logging.Formatter('%(asctime)s [%(levelname)s][%(name)s:%(funcName)s():%(lineno)s] - %(message)s')

    # file handler
    if len(file_name) != 0:
        fh = logging.FileHandler(file_name)
        fh.setLevel(logging.DEBUG)
        fh.setFormatter(formatter)
        logger.addHandler(fh)

    # console handler
    console_out = logging.StreamHandler()
    console_out.setLevel(logging.DEBUG)
    console_out.setFormatter(formatter)
    logger.addHandler(console_out)
    return logger
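
One detail of this setup that ties in with the fd=2 seen in thread 7's frame: a StreamHandler created without arguments writes to sys.stderr. The check below is only a small sketch to confirm that; the variable names are made up for illustration.

```python
import logging
import sys

# A StreamHandler built with no stream argument falls back to sys.stderr,
# which is file descriptor 2 unless it has been replaced.
handler = logging.StreamHandler()
uses_stderr = handler.stream is sys.stderr
```

So every record that reaches the console handler ends up as a write to whatever stderr is connected to.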

1 answer:

Answer 0 (score: 0)

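The problem was that I was writing log output to both the console and a file, but all of these processes were started with their output redirected to a pipe that nothing ever read from.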

p = Popen(proc_params,
          stdout=PIPE,
          stderr=STDOUT,
          close_fds=ON_POSIX,
          bufsize=1
          )

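So in this case the pipe has a limited buffer size, and once it fills up, the next write blocks. The thread stuck in write(fd=2) (thread 7 above) still holds the logging handler's lock, so every other thread that tries to log waits on that lock, and the whole process appears deadlocked.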
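
The buffer limit can be observed directly. The following is a minimal sketch, assuming a POSIX system (it relies on fcntl); pipe_capacity is a hypothetical helper name, not something from my code. It fills a pipe that nobody reads, using non-blocking writes so that the point where a normal writer would stall shows up as an error instead of a hang.

```python
import errno
import fcntl
import os

def pipe_capacity():
    """Fill an unread pipe with non-blocking writes; return how many bytes fit."""
    read_fd, write_fd = os.pipe()
    try:
        # Make the write end non-blocking so a full pipe raises instead of hanging.
        flags = fcntl.fcntl(write_fd, fcntl.F_GETFL)
        fcntl.fcntl(write_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        total = 0
        chunk = b"x" * 1024
        while True:
            try:
                total += os.write(write_fd, chunk)
            except OSError as exc:
                if exc.errno == errno.EAGAIN:
                    break  # the pipe is full; a blocking writer would stall here
                raise
        return total
    finally:
        os.close(read_fd)
        os.close(write_fd)

capacity = pipe_capacity()
```

On Linux the result is typically 64 KiB, though the exact value depends on the system. With 100 threads logging to the console, that amount is reached quickly.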

This is described here: https://docs.python.org/2/library/subprocess.html

Note

Do not use stdout=PIPE or stderr=PIPE with this function as that can deadlock based on the child process output volume. Use Popen with the communicate() method when you need pipes. 

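That note is attached to functions I do not use, but it appears to apply just as well to a process started with Popen if you never read from the pipe.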
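
One way to keep the pipes while avoiding the stall is to keep reading from them. Below is a minimal sketch, not my actual code: run_and_drain is a hypothetical helper, and the child command is just a stand-in that writes far more to stderr than a pipe buffer usually holds. A background thread drains the pipe so the child never blocks on a full buffer.

```python
import subprocess
import sys
import threading

def run_and_drain(cmd):
    """Start cmd with stdout and stderr on one pipe and keep that pipe drained."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    chunks = []

    def drain():
        # Read until EOF so the child never blocks on a full pipe.
        for chunk in iter(lambda: proc.stdout.read(4096), b""):
            chunks.append(chunk)
        proc.stdout.close()

    reader = threading.Thread(target=drain)
    reader.daemon = True
    reader.start()
    return proc, reader, chunks

# Stand-in child (an assumption for illustration): writes 200000 bytes to stderr.
child_code = "import sys; sys.stderr.write('x' * 200000)"
proc, reader, chunks = run_and_drain([sys.executable, "-c", child_code])
proc.wait()
reader.join()
output = b"".join(chunks)
```

If the output is only needed after the child has exited, communicate() does the same job with less code. If the output is not needed at all, redirecting it to a file or to os.devnull instead of PIPE avoids the problem entirely.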