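Consider the following C++ program. I expect the first thread that calls exit to terminate the program. That is what happens when I compile it with g++ -g test.cxx -lpthread. However, when I link against TCMalloc (g++ -g test.cxx -lpthread -ltcmalloc), it hangs. Why?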
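Inspecting the stack frames shows that the first thread to call exit is stuck in __unregister_atfork, waiting for some kind of reference-count variable to reach 0. Because that thread acquired the mutex beforehand, all the other threads are deadlocked. My guess is that there is some interaction between tcmalloc's atfork handlers and my code.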
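For context, atfork handlers are normally registered with pthread_atfork, and glibc unregisters a shared library's handlers through __unregister_atfork when that library's destructors run. Below is a minimal sketch of how such handlers fire around fork(). The counter and function names are hypothetical and only for illustration; I have not verified that tcmalloc registers its handlers exactly this way.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical counters, used only to observe when each handler runs.
static int prepare_calls = 0;
static int parent_calls = 0;
static int child_calls = 0;

static void on_prepare() { ++prepare_calls; }  // runs in the forking thread, before fork()
static void on_parent()  { ++parent_calls; }   // runs in the parent, after fork()
static void on_child()   { ++child_calls; }    // runs in the child, after fork()

// Registers the three handlers; returns 0 on success.
static int register_handlers() {
    return pthread_atfork(on_prepare, on_parent, on_child);
}

// Forks once and waits for the child.
// Returns 1 if the child saw its handler run exactly once, 0 if not, -1 on error.
static int fork_and_report() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // _exit skips atexit handlers and global destructors in the child.
        _exit(child_calls == 1 ? 1 : 0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling register_handlers() once and then fork_and_report() from main shows the prepare and parent handlers running in the parent, while the child handler runs only in the child's copy of the process.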
Tested on CentOS 6.4 with gperftools 2.0.
$ cat test.cxx
#include <unistd.h>
#include <iostream>
#include <pthread.h>
#include <stdlib.h>
using namespace std;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static void* task(void*) {
    if (fork() == 0)
        return NULL;
    pthread_mutex_lock(&m);
    exit(0);
}
int main(int argc, char **argv) {
    cout << getpid() << endl;
    pthread_t t;
    for (unsigned i = 0; i < 100; ++i) {
        pthread_create(&t, NULL, task, NULL);
    }
    sleep(9999);
}
$ g++ -g test.cxx -lpthread && ./a.out
19515
$ g++ -g test.cxx -lpthread -ltcmalloc && ./a.out
24252
<<< process hangs indefinitely >>>
^C
$ pstack 24252
Thread 101 (Thread 0x7ffaabdf7700 (LWP 24253)):
#0 0x000000328c4f84c4 in __unregister_atfork () from /lib64/libc.so.6
#1 0x00007ffaac02d2c6 in __do_global_dtors_aux () from /usr/lib64/libtcmalloc.so.4
#2 0x0000000000000000 in ?? ()
Thread 100 (Thread 0x7ffaab3f6700 (LWP 24254)):
#0 0x000000328cc0e054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x000000328cc09388 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x000000328cc09257 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000400abf in task(void*) ()
#4 0x000000328cc07851 in start_thread () from /lib64/libpthread.so.0
#5 0x000000328c4e894d in clone () from /lib64/libc.so.6
<<< the other 98 threads are also deadlocked >>>
Thread 1 (Thread 0x7ffaabdf9740 (LWP 24252)):
#0 0x000000328c4acbcd in nanosleep () from /lib64/libc.so.6
#1 0x000000328c4aca40 in sleep () from /lib64/libc.so.6
#2 0x0000000000400b33 in main ()
Edit: I think the problem may be that exit is not thread-safe. According to {{3}}, exit is thread-safe. However, POSIX indicates that exit is not thread-safe.