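I have a Python daemon process. Each time a command is invoked, the daemon spawns a thread to service that call. Inside the thread, I use Python's subprocess.Popen to execute a shell command, like this:

    def __executeCommand(self, cmd):
        try:
            self.__assertEmptyCommand(cmd)
            logger.debug('Executing command : ' + str(cmd))
            # kept for use after analysis
            # cmd = cmd.split(' ')
            # proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False)
            proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
            (output, error) = proc.communicate()
            returnCode = proc.returncode
            logger.debug('Command execution finished: Cmd:' + str(cmd) + "\nReturn code:"
                         + str(returnCode) + "\nOutput:" + str(output) + "\nError:" + str(error))
            # Compare by value; "is" tests identity and is unreliable for strings.
            if output == '':
                output = []
            returnOutput = output
            # Strip a single trailing newline from the output.
            if output != [] and output[-1] == '\n':
                returnOutput = output[:-1]
            return (returnCode, str(returnOutput), str(error))
        except Exception:
            # The original except clause was not included in the post;
            # logging and re-raising here is an assumption.
            logger.exception('Command execution failed: ' + str(cmd))
            raise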
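The wrapper above calls communicate() with no upper bound, so a command that never finishes keeps its worker thread blocked indefinitely. As a minimal sketch of bounding that wait (not part of the original code): this assumes Python 3.3 or later, where communicate() accepts a timeout, rather than the Python 2.4 visible in the traces further down. The name run_with_timeout is hypothetical, and the command is passed as an argument list instead of through a shell so that kill() reaches the command itself.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Run argv without a shell and kill it if it exceeds `timeout` seconds.

    Returns (return_code, stdout, stderr, timed_out).
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    timed_out = False
    try:
        output, error = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        timed_out = True
        proc.kill()
        # Collect whatever was written before the kill and reap the child.
        output, error = proc.communicate()
    return proc.returncode, output.decode(), error.decode(), timed_out


if __name__ == "__main__":
    # A child that sleeps far longer than the allowed 0.5 seconds.
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 0.5))
```

This only bounds the wait inside communicate(). If the hang happens earlier, inside the Popen() call itself, the timeout is never reached and this sketch does not help.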
During execution of some commands, proc.communicate() never returns. When I attach strace to the parent daemon process that runs the thread, I get the following output:
### Looking at strace for PID -> 14879 -> Main Daemon process ###
[root@mymach ~]# strace -p 14879
Process 14879 attached - interrupt to quit
select(0, NULL, NULL, NULL, {1, 122000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {2, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {2, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {2, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {2, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {2, 0}) = 0 (Timeout)
Looking at the child PID that runs the shell under Popen:
### Looking at strace for PID -> 24294 -> Child Process process ###
[root@mymach ~]# strace -p 24294
Process 24294 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
nanosleep({0, 2000001}, NULL) = 0
nanosleep({0, 2000001}, NULL) = 0
nanosleep({0, 2000001}, NULL) = 0
nanosleep({0, 2000001}, NULL) = 0
nanosleep({0, 2000001}, NULL) = 0
nanosleep({0, 2000001}, NULL) = 0
I attached gdb to the running daemon process and saw the following stack trace for the thread executing the Popen code:
### attaching GDB to main Daemon Process ###
[root@mymach ~]# gdb attach 14879
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-42.el5.HYDRA)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
--snip--
(gdb) info thread
11 Thread 0x406c4940 (LWP 15072) 0x00000035ee00e291 in nanosleep () from /lib64/libpthread.so.0
10 Thread 0x410c5940 (LWP 15073) 0x00000035ee00b1c0 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
9 Thread 0x41ac6940 (LWP 15075) 0x00000035ee00b1c0 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
8 Thread 0x424c7940 (LWP 15078) 0x00000035ee00e291 in nanosleep () from /lib64/libpthread.so.0
7 Thread 0x42ec8940 (LWP 15081) 0x00000035ee00b1c0 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
6 Thread 0x438c9940 (LWP 15084) 0x00000035ee00cd91 in sem_wait () from /lib64/libpthread.so.0
5 Thread 0x442ca940 (LWP 15085) 0x00000035ed4cc3f2 in select () from /lib64/libc.so.6
4 Thread 0x44ccb940 (LWP 15088) 0x00000035ee00dc0b in accept () from /lib64/libpthread.so.0
3 Thread 0x49cd3940 (LWP 24090) 0x00000035ee00cd91 in sem_wait () from /lib64/libpthread.so.0
2 Thread 0x47ed0940 (LWP 24281) 0x00000035ee00d9eb in read () from /lib64/libpthread.so.0 -----------------> Thread common for PID -> 24294
* 1 Thread 0x2b34ca2d7610 (LWP 14879) 0x00000035ed4cc3f2 in select () from /lib64/libc.so.6
(gdb) thread 2
[Switching to thread 2 (Thread 0x47ed0940 (LWP 24281))]#0 0x00000035ee00d9eb in read ()
from /lib64/libpthread.so.0
(gdb) bt
#0 0x00000035ee00d9eb in read () from /lib64/libpthread.so.0
#1 0x00000035ee8bfc41 in read (self=<value optimized out>, args=<value optimized out>)
from /usr/lib64/libpython2.4.so.1.0
--snip--
#20 0x00000035ee895ad8 in call_function (f=0x2c00013339e0) at Python/ceval.c:3656
#21 PyEval_EvalFrame (f=0x2c00013339e0) at Python/ceval.c:2163
#22 0x00000035ee895c8b in call_function (f=0x2c0002bb9a20) at Python/ceval.c:3645
#23 PyEval_EvalFrame (f=0x2c0002bb9a20) at Python/ceval.c:2163
#24 0x00000035ee895c8b in call_function (f=0x2c0002136020) at Python/ceval.c:3645
---Type <return> to continue, or q <return> to quit---q
Attaching GDB to the child process created this way gives the following backtrace:
[root@mymach ~]# gdb attach 24294
--snip--
(gdb) bt
#0 0x00000035ee00e291 in nanosleep () from /lib64/libpthread.so.0
#1 0x00002b34ca068258 in SpinLock::SlowLock (this=0x2b34ca298440)
at src/allocator/base/spinlock.cc:104
#2 0x00002b34ca061526 in Lock (this=0x2b34ca298440, start=0x47ecd330, end=0x47ecd328,
N=<value optimized out>) at src/allocator/base/spinlock.h:90
#3 tcmalloc::CentralFreeList::RemoveRange (this=0x2b34ca298440, start=0x47ecd330, end=0x47ecd328,
N=<value optimized out>) at src/allocator/central_freelist.cc:219
#4 0x00002b34ca059ffe in tcmalloc::ThreadCache<false>::FetchFromCentralCache (
this=0x2c00000305c0, cl=<value optimized out>, byte_size=48)
at src/allocator/thread_cache.cc:159
#5 0x00002b34ca05e420 in Allocate (this=0x2c0000030580, size=<value optimized out>)
at src/allocator/thread_cache.h:331
#6 allocateWithSizeUpdate (this=0x2c0000030580, size=<value optimized out>)
at src/allocator/tcmalloc_heap.h:110
#7 allocate (this=0x2c0000030580, size=<value optimized out>) at src/allocator/stats_heap.h:77
#8 tcmalloc::Heapifier<tcmalloc::StatsHeap<false> >::allocate (this=0x2c0000030580,
size=<value optimized out>) at src/allocator/heap.h:107
#9 0x00002b34ca0794f2 in unlimited_cpp_alloc (old_ptr=0x0, new_size=<value optimized out>)
at src/allocator/tcmalloc.cc:850
--snip--
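I am not sure why the child process is waiting on a SpinLock while the main daemon process is waiting for a read() response on FD 0 (stdin). This does not happen every time: it shows up on some runs, and on others everything works fine!
Any help with this is much appreciated.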
Answer 0 (score: 0)
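The problem was caused by the custom allocator we use: the child process was blocked on a memory allocation made inside subprocess.Popen.
This question can be ignored.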
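One plausible reading of the traces, offered as an illustration rather than a confirmed diagnosis: if another daemon thread held the allocator's spinlock at the moment of fork(), the child inherits the lock in its locked state but not the thread that would release it, so the first allocation in the child waits forever. The sketch below reproduces that pattern with an ordinary threading.Lock standing in for the allocator lock. The names child_can_take_lock and demo are hypothetical, and the code targets Python 3 on a POSIX system rather than the Python 2.4 shown in the traces.

```python
import os
import threading
import warnings


def child_can_take_lock(lock, timeout=1.0):
    """Fork, then report whether the child acquired `lock` within `timeout` seconds."""
    with warnings.catch_warnings():
        # Python 3.12+ warns when fork() is called with several threads alive,
        # which is the very hazard being demonstrated.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was copied, so a lock held by another
        # thread at fork time has no owner left to release it.
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


def demo():
    """Return (stuck_while_held, ok_when_free) for a stand-in allocator lock."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Stand-in for a thread that is in the middle of an allocation.
        with lock:
            held.set()
            release.wait()

    worker = threading.Thread(target=holder)
    worker.start()
    held.wait()
    stuck_while_held = not child_can_take_lock(lock)  # forked while the lock is held
    release.set()
    worker.join()
    ok_when_free = child_can_take_lock(lock)  # forked with the lock free
    return stuck_while_held, ok_when_free


if __name__ == "__main__":
    print(demo())
```

The child forked while the lock is held times out, whereas the child forked after the lock is released acquires it at once. That would also match the intermittent behavior described in the question, since the hang depends on what the other threads happen to be doing when fork() runs.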