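I have a C# application (servicebus) running on a private web server. Its basic job is to accept web requests and create other processes to handle the data packages described in those requests. Processing is often ongoing and can take weeks.
Occasionally, servicebus starts consuming a lot of CPU. Normally it is idle and accumulates 1 or 2 seconds of CPU time per day. When it gets into this strange mode, it continuously consumes 100% or more of the CPU. At that point, if a new request comes in, Apache spawns a new servicebus instance. I then have two copies of servicebus running (possibly both handling processing requests at the same time; I don't know).
Here is the normal process (via ps -aef):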
UID PID PPID C STIME TTY TIME CMD
apache 8978 1 0 11:51 ? 00:00:01 /opt/mono/bin/mono /opt/mono/lib/mono/4.0/mod-mono-server4.exe --filename /tmp/mod_mono_server_default --applications /:/opt/ov/vespa/servicebus --nonstop
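As you can see, the application is a C# program (compiled with VS 2010 for .NET 4) that runs through mod-mono-server4 under Mono. This is a Red Hat Enterprise Linux 6.5 system.
After a while, this process "goes crazy" and starts consuming a lot of CPU, and mod-mono-server creates a new instance. As you can see, I did not notice until Monday morning, after it had used more than two days of CPU time. Here is the new ps -aef output: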
UID PID PPID C STIME TTY TIME CMD
apache 8978 1 83 Sep19 ? 2-08:26:25 /opt/mono/bin/mono /opt/mono/lib/mono/4.0/mod-mono-server4.exe --filename /tmp/mod_mono_server_default --applications /:/opt/ov/vespa/servicebus --nonstop
apache 32538 1 0 Sep21 ? 00:00:00 /opt/mono/bin/mono /opt/mono/lib/mono/4.0/mod-mono-server4.exe --filename /tmp/mod_mono_server_default --applications /:/opt/ov/vespa/servicebus --nonstop
In case you need to see how the application is configured, here is the snippet from the application's conf.d file:
# The user and group need to be set before mod_mono.conf is loaded.
User apache
Group apache
# Service Bus setup
Include /etc/httpd/conf/mod_mono.conf
Listen 8081
<VirtualHost *:8081>
DocumentRoot /opt/ov/vespa/servicebus
MonoServerPath default /opt/mono/bin/mod-mono-server4
MonoApplications "/:/opt/ov/vespa/servicebus"
<Location "/">
SetHandler mono
Allow from all
</Location>
</VirtualHost>
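The basic question is: how do I debug this and find out what is wrong with my application? That is a bit vague, though. Normally, I would set mono to debug mode and, when it gets into this strange mode, use kill -ABRT to get a core dump. I assume I could then find the for loop, while loop, etc. that is stuck and fix my bug. So the real question is how to do that. Is the process with PID=8978 actually my application being interpreted by mono, or is it mono running mod-mono-server4.exe? Or is it mono interpreting mod-mono-server4.exe, which in turn interprets servicebus? What do I set in the Apache config file to pass parameters to mono so that I get the --debug I want?
Normally, to debug I would need a process like: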
/opt/mono/bin/mono --debug /opt/test/testapp.exe
So I need to get --debug onto the command line and figure out which PID to actually kill. Then I can use the techniques from http://www.mono-project.com/docs/debug+profile/debug/ to debug the core file.
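One way to pick the PID to signal is to take the mod-mono-server4 row with the largest C (CPU utilisation) column from ps -aef. The sketch below assumes the ps -aef column layout shown earlier in this post; busiest_mono_pid is a hypothetical helper name, not something shipped with Mono or mod_mono.

```shell
# Hypothetical helper: reads `ps -aef` output on stdin and prints the PID of
# the mod-mono-server4 row with the largest C (CPU utilisation) column.
# Prints nothing if no such row is present.
busiest_mono_pid() {
    awk 'BEGIN { max = -1 }
         /mod-mono-server4\.exe/ && $4 + 0 > max { max = $4 + 0; pid = $2 }
         END { if (pid != "") print pid }'
}

# Usage sketch (commented out so that nothing is signalled by accident):
# pid=$(ps -aef | busiest_mono_pid)
# kill -ABRT "$pid"
```

With the second ps -aef listing above as input, this selects 8978 (C = 83) rather than 32538 (C = 0).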
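Note: I have tried putting the MonoMaxCPUTime and MonoAutoRestartTime directives into the Apache conf file to work around this. The trouble is that they work fine only while everything is nominal. Once the process gets into this bad state (consuming lots of CPU), the restart fails. Or rather, it succeeds in creating a new process but fails to remove the old one (basically the state I am already in).
Debugging so far: I see that the log files for PID=8979 stop at 03:27 on Sep 21. Given that it regularly racks up 200% or 300% or more CPU, that could easily be the time of the "crash". Looking at the Apache logs, I found an unusual event at that time. A dump of the log follows: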
...
[Sun Sep 21 03:28:01 2014] [notice] SIGHUP received. Attempting to restart
mod-mono-server received a shutdown message
httpd: Could not reliably determine the server's fully qualified domain name, using localhost.localdomain for ServerName
Stacktrace:
Native stacktrace:
/opt/mono/bin/mono() [0x48cc26]
/lib64/libpthread.so.0() [0x32fca0f710]
/lib64/libpthread.so.0(pthread_cond_wait+0xcc) [0x32fca0b5bc]
/opt/mono/bin/mono() [0x5a6a9c]
/opt/mono/bin/mono() [0x5ad4e9]
/opt/mono/bin/mono() [0x5116d8]
/opt/mono/bin/mono(mono_thread_manage+0x1ad) [0x5161cd]
/opt/mono/bin/mono(mono_main+0x1401) [0x46a671]
/lib64/libc.so.6(__libc_start_main+0xfd) [0x32fc21ed1d]
/opt/mono/bin/mono() [0x4123a9]
Debug info from gdb:
warning: File "/opt/mono/bin/mono-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "/usr/share/gdb/auto-load:/usr/lib/debug:/usr/bin/mono-gdb.py".
To enable execution of this file add
add-auto-load-safe-path /opt/mono/bin/mono-gdb.py
line to your configuration file "$HOME/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "$HOME/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
[New LWP 9148]
[New LWP 9135]
[New LWP 9000]
[New LWP 8991]
[New LWP 8990]
[New LWP 8988]
[New LWP 8987]
[New LWP 8986]
[New LWP 8985]
[New LWP 8984]
[Thread debugging using libthread_db enabled]
0x00000032fca0e75d in read () from /lib64/libpthread.so.0
11 Thread 0x7f0d8bcaf700 (LWP 8984) 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
10 Thread 0x7f0d8b2ae700 (LWP 8985) 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
9 Thread 0x7f0d8a8ad700 (LWP 8986) 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
8 Thread 0x7f0d89eac700 (LWP 8987) 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
7 Thread 0x7f0d894ab700 (LWP 8988) 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
6 Thread 0x7f0d88aaa700 (LWP 8990) 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
5 Thread 0x7f0d880a9700 (LWP 8991) 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
4 Thread 0x7f0d8713c700 (LWP 9000) 0x00000032fca0d930 in sem_wait () from /lib64/libpthread.so.0
3 Thread 0x7f0d86157700 (LWP 9135) 0x00000032fc27a983 in malloc () from /lib64/libc.so.6
2 Thread 0x7f0d8568b700 (LWP 9148) 0x00000032fc2792f0 in _int_malloc () from /lib64/libc.so.6
* 1 Thread 0x7f0d8bcb0740 (LWP 8978) 0x00000032fca0e75d in read () from /lib64/libpthread.so.0
Thread 11 (Thread 0x7f0d8bcaf700 (LWP 8984)):
#0 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005d59f7 in GC_wait_marker ()
#2 0x00000000005dbabd in GC_help_marker ()
#3 0x00000000005d4778 in GC_mark_thread ()
#4 0x00000032fca079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x00000032fc2e8b5d in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x7f0d8b2ae700 (LWP 8985)):
#0 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005d59f7 in GC_wait_marker ()
#2 0x00000000005dbabd in GC_help_marker ()
#3 0x00000000005d4778 in GC_mark_thread ()
#4 0x00000032fca079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x00000032fc2e8b5d in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x7f0d8a8ad700 (LWP 8986)):
#0 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005d59f7 in GC_wait_marker ()
#2 0x00000000005dbabd in GC_help_marker ()
#3 0x00000000005d4778 in GC_mark_thread ()
#4 0x00000032fca079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x00000032fc2e8b5d in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x7f0d89eac700 (LWP 8987)):
#0 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005d59f7 in GC_wait_marker ()
#2 0x00000000005dbabd in GC_help_marker ()
#3 0x00000000005d4778 in GC_mark_thread ()
#4 0x00000032fca079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x00000032fc2e8b5d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7f0d894ab700 (LWP 8988)):
#0 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005d59f7 in GC_wait_marker ()
#2 0x00000000005dbabd in GC_help_marker ()
#3 0x00000000005d4778 in GC_mark_thread ()
#4 0x00000032fca079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x00000032fc2e8b5d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7f0d88aaa700 (LWP 8990)):
#0 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005d59f7 in GC_wait_marker ()
#2 0x00000000005dbabd in GC_help_marker ()
#3 0x00000000005d4778 in GC_mark_thread ()
#4 0x00000032fca079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x00000032fc2e8b5d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7f0d880a9700 (LWP 8991)):
#0 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000005d59f7 in GC_wait_marker ()
#2 0x00000000005dbabd in GC_help_marker ()
#3 0x00000000005d4778 in GC_mark_thread ()
#4 0x00000032fca079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x00000032fc2e8b5d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7f0d8713c700 (LWP 9000)):
#0 0x00000032fca0d930 in sem_wait () from /lib64/libpthread.so.0
#1 0x00000000005bea28 in mono_sem_wait ()
#2 0x000000000053b2bb in finalizer_thread ()
#3 0x000000000051375b in start_wrapper ()
#4 0x00000000005a8214 in thread_start_routine ()
#5 0x00000000005d565a in GC_start_routine ()
#6 0x00000032fca079d1 in start_thread () from /lib64/libpthread.so.0
#7 0x00000032fc2e8b5d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7f0d86157700 (LWP 9135)):
#0 0x00000032fc27a983 in malloc () from /lib64/libc.so.6
#1 0x00000000005cd0e6 in monoeg_malloc ()
#2 0x00000000005cbef1 in monoeg_g_hash_table_insert_replace ()
#3 0x00000000005acff5 in WaitForMultipleObjectsEx ()
#4 0x0000000000512694 in ves_icall_System_Threading_WaitHandle_WaitAny_internal ()
#5 0x00000000417b0270 in ?? ()
#6 0x00007f0d68000c21 in ?? ()
#7 0x00007f0d847c4b40 in ?? ()
#8 0x00007f0d68003e00 in ?? ()
#9 0x000000004023e890 in ?? ()
#10 0x00007f0d68003e00 in ?? ()
#11 0x00007f0d86156940 in ?? ()
#12 0x00007f0d861568a0 in ?? ()
#13 0x00007f0d8767d000 in ?? ()
#14 0xffffffffffffffff in ?? ()
#15 0x00007f0d86156cc0 in ?? ()
#16 0x00007f0d847c4b40 in ?? ()
#17 0x000000004023e268 in ?? ()
#18 0x0000000000000000 in ?? ()
Thread 2 (Thread 0x7f0d8568b700 (LWP 9148)):
#0 0x00000032fc2792f0 in _int_malloc () from /lib64/libc.so.6
#1 0x00000032fc27a636 in calloc () from /lib64/libc.so.6
#2 0x00000000005cd148 in monoeg_malloc0 ()
#3 0x00000000005cbb94 in monoeg_g_hash_table_new ()
#4 0x00000000005acf94 in WaitForMultipleObjectsEx ()
#5 0x0000000000512694 in ves_icall_System_Threading_WaitHandle_WaitAny_internal ()
#6 0x00000000417b0270 in ?? ()
#7 0x00007f0d60000c21 in ?? ()
#8 0x00007f0d8767d000 in ?? ()
#9 0xffffffffffffffff in ?? ()
#10 0x000000004023e890 in ?? ()
#11 0x00007f0d68003e00 in ?? ()
#12 0x00007f0d8568a940 in ?? ()
#13 0x00007f0d8568a8a0 in ?? ()
#14 0x00007f0d8767d000 in ?? ()
#15 0xffffffffffffffff in ?? ()
#16 0x00007f0d8568acc0 in ?? ()
#17 0x00007f0d864e2990 in ?? ()
#18 0x000000004023e268 in ?? ()
#19 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7f0d8bcb0740 (LWP 8978)):
#0 0x00000032fca0e75d in read () from /lib64/libpthread.so.0
#1 0x000000000048cdb6 in mono_handle_native_sigsegv ()
#2 <signal handler called>
#3 0x00000032fca0b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#4 0x00000000005a6a9c in _wapi_handle_timedwait_signal_handle ()
#5 0x00000000005ad4e9 in WaitForMultipleObjectsEx ()
#6 0x00000000005116d8 in wait_for_tids ()
#7 0x00000000005161cd in mono_thread_manage ()
#8 0x000000000046a671 in mono_main ()
#9 0x00000032fc21ed1d in __libc_start_main () from /lib64/libc.so.6
#10 0x00000000004123a9 in _start ()
=================================================================
Got a SIGABRT while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================
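The gdb thread list in the dump above can be condensed to see where the threads are parked. This is a minimal sketch, assuming the log has been saved and is fed in on stdin; summarize_threads is a hypothetical helper name.

```shell
# Hypothetical helper: reads a Mono/gdb crash log on stdin and prints, for the
# one-line-per-thread summary, how many threads sit in each function.
summarize_threads() {
    awk '/Thread 0x[0-9a-f]+ \(LWP [0-9]+\) +0x/ {
             for (i = 1; i < NF; i++)
                 if ($i == "in") {
                     name = $(i + 1)
                     sub(/@@.*/, "", name)   # drop the glibc symbol version suffix
                     count[name]++
                     break
                 }
         }
         END { for (name in count) print count[name], name }' | sort -rn
}

# Usage sketch: summarize_threads < error_log
```

For the dump above this gives 7 threads in pthread_cond_wait (the GC marker threads) and one each in sem_wait, malloc, _int_malloc and read. The two threads in malloc and _int_malloc (LWP 9135 and 9148) are both inside WaitHandle.WaitAny, but a single snapshot cannot show whether they are the ones spinning.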
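I think this means the process had a seg fault, tried to dump core or something, and failed in the attempt? Or did it receive the ABRT signal while handling a SIGSEGV? Either way, this is mono doing the dumping, right? I searched the entire file system and no core was produced, so I am not sure how apache/gdb managed it.
In case it matters, I have RedHat 6.5, Mono 2.10.8, gcc 4.4.7, mod-mono-server4.exe 2.10.0.0.
It basically boils down to these questions.
How do I get --debug into the mono command that Apache issues? Or am I completely off base, and the answers to these questions won't help me?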
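One candidate explanation for the missing core file is that the core size limit of the Apache-launched process is 0. This can be checked on Linux through /proc. The sketch below is only that check; core_soft_limit_of is a hypothetical helper name, and reading another user's /proc entry may need root. The kernel setting in /proc/sys/kernel/core_pattern also decides where a core ends up, so a non-zero limit alone does not guarantee a file in the working directory.

```shell
# Hypothetical helper: print the soft "Max core file size" limit of a running
# process as reported by /proc/<pid>/limits (0 means no core file is written).
core_soft_limit_of() {
    awk '/^Max core file size/ { print $5 }' "/proc/$1/limits"
}

# Usage sketch:
# core_soft_limit_of 8978
# cat /proc/sys/kernel/core_pattern
```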
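Answer 0 (score: 1)
First: Mono 2.10 is quite old, and you may be hitting a bug that has already been fixed in the latest release, 3.8.
As for passing --debug to the application, you can set the environment variable MONO_OPTIONS=--debug, which has the same effect as specifying it on the command line.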
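Because Apache, not an interactive shell, starts mono here, the variable has to be in the environment of the Apache-launched process. The mod_mono documentation lists a MonoSetEnv directive for this, which is worth checking against the installed version. The sketch below only verifies the result on Linux by reading the environment the process was started with; mono_options_of is a hypothetical helper name, and reading another user's /proc entry may need root.

```shell
# Hypothetical helper: print the MONO_OPTIONS entry, if any, from the
# environment a process was started with (Linux /proc). Returns non-zero
# when the variable is absent.
mono_options_of() {
    tr '\0' '\n' < "/proc/$1/environ" | grep '^MONO_OPTIONS='
}

# Usage sketch: mono_options_of 8978
```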