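Is there a way to configure the directory where core dump files are placed for a specific process?
I have a daemon written in C++ for which I would like to configure the core dump directory. Optionally, the filename pattern should be configurable as well.
I know about /proc/sys/kernel/core_pattern, but that changes the pattern and directory structure globally.
Apache has the CoreDumpDirectory directive, so it seems to be possible.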
Answer 0 (score: 14)
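No, you cannot set it per process. The core file is dumped either into the current working directory of the process or, if the pattern includes a directory, into the directory set in /proc/sys/kernel/core_pattern.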
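To make the rule above concrete, here is a minimal sketch in C that classifies a core_pattern string by where the dump would end up. The enum and the function name classify_core_pattern are hypothetical and exist only for this illustration; the sketch assumes the usual Linux behaviour described in core(5), where a leading '|' means a pipe and any non-absolute pattern is resolved against the crashing process's working directory.

```c
/* Hypothetical names, for illustration only. */
enum core_target {
    CORE_TO_PIPE,      /* pattern starts with '|': piped to a program */
    CORE_TO_ABSOLUTE,  /* pattern starts with '/': fixed, global directory */
    CORE_TO_CWD        /* anything else: relative to the process's cwd */
};

/* Classify a core_pattern value by looking at its first character. */
enum core_target classify_core_pattern(const char *pattern)
{
    if (pattern[0] == '|')
        return CORE_TO_PIPE;
    if (pattern[0] == '/')
        return CORE_TO_ABSOLUTE;
    return CORE_TO_CWD;
}
```

Only the last case leaves a process any influence over the location, because it can choose its own working directory.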
CoreDumpDirectory in Apache is a hack: Apache registers a signal handler for every signal that causes a core dump, and changes the current directory inside that handler.
/* handle all varieties of core dumping signals */
static void sig_coredump(int sig)
{
    apr_filepath_set(ap_coredump_dir, pconf);
    apr_signal(sig, SIG_DFL);
#if AP_ENABLE_EXCEPTION_HOOK
    run_fatal_exception_hook(sig);
#endif
    /* linuxthreads issue calling getpid() here:
     *   This comparison won't match if the crashing thread is
     *   some module's thread that runs in the parent process.
     *   The fallout, which is limited to linuxthreads:
     *   The special log message won't be written when such a
     *   thread in the parent causes the parent to crash.
     */
    if (getpid() == parent_pid) {
        ap_log_error(APLOG_MARK, APLOG_NOTICE,
                     0, ap_server_conf,
                     "seg fault or similar nasty error detected "
                     "in the parent process");
        /* XXX we can probably add some rudimentary cleanup code here,
         * like getting rid of the pid file.  If any additional bad stuff
         * happens, we are protected from recursive errors taking down the
         * system since this function is no longer the signal handler   GLA
         */
    }
    kill(getpid(), sig);
    /* At this point we've got sig blocked, because we're still inside
     * the signal handler.  When we leave the signal handler it will
     * be unblocked, and we'll take the signal... and coredump or whatever
     * is appropriate for this particular Unix.  In addition the parent
     * will see the real signal we received -- whereas if we called
     * abort() here, the parent would only see SIGABRT.
     */
}
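The same trick can be sketched for a standalone daemon without APR. This is only a rough illustration of the idea, not Apache's code: the names core_dir, on_fatal_signal and install_core_dir_handler are made up, the list of signals is an assumption, and it only helps when core_pattern is a relative pattern, so that the working directory decides where the core lands.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names: core_dir, on_fatal_signal, install_core_dir_handler. */
static char core_dir[4096];

static void on_fatal_signal(int sig)
{
    /* chdir(), signal() and raise() are async-signal-safe per POSIX. */
    if (chdir(core_dir) != 0) {
        /* Nothing safe to do here; the core goes to the old cwd. */
    }
    signal(sig, SIG_DFL);  /* restore the default (core dumping) action */
    raise(sig);            /* delivered once this handler returns */
}

/* Remember dir and hook the usual core-dumping signals.
 * Returns 0 on success, -1 on error. */
int install_core_dir_handler(const char *dir)
{
    static const int signals[] = { SIGSEGV, SIGBUS, SIGABRT, SIGILL, SIGFPE };
    struct sigaction sa;
    size_t i;

    if (strlen(dir) >= sizeof(core_dir))
        return -1;
    strcpy(core_dir, dir);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fatal_signal;
    sigemptyset(&sa.sa_mask);

    for (i = 0; i < sizeof(signals) / sizeof(signals[0]); i++) {
        if (sigaction(signals[i], &sa, NULL) != 0)
            return -1;
    }
    return 0;
}
```

A daemon would call install_core_dir_handler() once at startup with the directory read from its own configuration. The code is plain C, so it should also compile as part of a C++ daemon.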
Answer 1 (score: 0)
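This can be done with the "| command" mechanism of the core_pattern file. The command that gets executed can create directories and files as needed. The following specifiers can be passed to the command as arguments (see man 5 core):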
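%%  a single % character
%c  core file size soft resource limit of the crashing process
%d  dump mode, same as the value returned by prctl(2) PR_GET_DUMPABLE
%e  executable filename (without path prefix)
%E  pathname of the executable, with slashes ('/') replaced by exclamation marks ('!')
%g  (numeric) real GID of the dumped process
%h  hostname (same as the nodename returned by uname(2))
%i  TID of the thread that triggered the core dump, as seen in the PID namespace in which the thread resides
%I  TID of the thread that triggered the core dump, as seen in the initial PID namespace
%p  PID of the dumped process, as seen in the PID namespace in which the process resides
%P  PID of the dumped process, as seen in the initial PID namespace
%s  number of the signal causing the dump
%t  time of the dump, expressed as seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC)
%u  (numeric) real UID of the dumped process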
For example, a script (named crash.sh here) can be created as follows:
#!/bin/bash
# $1: process number on host side (%P)
# $2: program's name (%e)
OUTDIR=/tmp/core/$2
OUTFILE="core_$1"
# Create a per-program sub-directory under /tmp/core
mkdir -p "$OUTDIR"
# Write stdin (the core image) to a per-process file:
cat > "$OUTDIR"/"$OUTFILE"
exit 0
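If a compiled helper is preferred over a shell script, the same logic might look roughly like the C sketch below. The function name save_core and its parameters are hypothetical; the sketch assumes the core image arrives on a file descriptor (stdin when the kernel runs the handler) and that the PID and program name come from %P and %e. Unlike the script, it refuses names that contain a slash or start with a dot, which is an extra precaution and not something the original script does.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the core image read from in_fd to
 * <base>/<exe>/core_<pid>. Returns 0 on success, -1 on error. */
int save_core(int in_fd, const char *base, const char *exe, const char *pid)
{
    char path[4096];
    char buf[65536];
    ssize_t n;
    int out, len;

    /* Refuse names that could escape the target directory. */
    if (exe[0] == '\0' || exe[0] == '.' || strchr(exe, '/') ||
        pid[0] == '\0' || strchr(pid, '/'))
        return -1;

    /* Create <base> and <base>/<exe>; existing directories are fine. */
    if (mkdir(base, 0700) != 0 && errno != EEXIST)
        return -1;
    len = snprintf(path, sizeof(path), "%s/%s", base, exe);
    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;
    if (mkdir(path, 0700) != 0 && errno != EEXIST)
        return -1;

    len = snprintf(path, sizeof(path), "%s/%s/core_%s", base, exe, pid);
    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;
    out = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0)
        return -1;

    /* Copy until end of input, handling short writes. */
    while ((n = read(in_fd, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(out);
                return -1;
            }
            off += w;
        }
    }
    if (close(out) != 0 || n < 0)
        return -1;
    return 0;
}
```

A small main() would then call save_core(STDIN_FILENO, "/tmp/core", argv[2], argv[1]) and be registered in core_pattern in the same way as the script.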
In the shell:
$ chmod +x crash.sh
$ mv crash.sh /tmp # Put the script in some place
$ sudo su
# echo '|/tmp/crash.sh %P %e' > /proc/sys/kernel/core_pattern
# cat /proc/sys/kernel/core_pattern
|/tmp/crash.sh %P %e
# exit
$
Create a sample program that crashes (fail.c, for example):
int main(void)
{
    char *ptr = (char *)0;
    *ptr = 'q';
    return 0;
}
Compile the program (building several executables) and raise the core file size limit in the current shell:
$ gcc fail.c -o fail1
$ gcc fail.c -o fail2
$ ulimit -c
0
$ ulimit -c unlimited
$ ulimit -c
unlimited
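A daemon is usually not started from an interactive shell, so it may be more convenient to raise the limit from inside the process. The sketch below uses getrlimit(2) and setrlimit(2) with RLIMIT_CORE; the function name enable_core_dumps is hypothetical, and the sketch only raises the soft limit up to the hard limit, which an unprivileged process is allowed to do.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-size limit to the hard limit.
 * Returns 0 on success, -1 on error. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;  /* soft limit := hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}
```

Calling enable_core_dumps() early in the daemon's startup has roughly the effect of "ulimit -c unlimited" when the hard limit is itself unlimited.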
Run the failing programs several times to get several process IDs:
$ ./fail1
Segmentation fault (core dumped)
$ ./fail2
Segmentation fault (core dumped)
$ ./fail1
Segmentation fault (core dumped)
$ ./fail2
Segmentation fault (core dumped)
Look in /tmp/core, where the core_pattern handler redirected the core dumps:
$ ls -l /tmp/core
total 8
drwxrwxrwx 2 root root 4096 nov. 3 15:57 fail1
drwxrwxrwx 2 root root 4096 nov. 3 15:57 fail2
$ ls -l /tmp/core/fail1/
total 480
-rw-rw-rw- 1 root root 245760 nov. 3 15:57 core_10606
-rw-rw-rw- 1 root root 245760 nov. 3 15:57 core_10614
$ ls -l /tmp/core/fail2
total 480
-rw-rw-rw- 1 root root 245760 nov. 3 15:57 core_10610
-rw-rw-rw- 1 root root 245760 nov. 3 15:57 core_10618