Configurable core dump directory per process

Date: 2010-04-23 11:41:08

Tags: c++ linux debugging coredump

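Is there a way to configure the directory where core dump files are placed for a specific process?

I have a daemon process written in C++ for which I would like to configure the core dump directory. Optionally, the filename pattern should be configurable too.

I know about /proc/sys/kernel/core_pattern, but that changes the pattern and the directory structure globally.

Apache has the CoreDumpDirectory directive, so it seems to be possible.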

2 answers:

Answer 0: (score: 14)

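No, you cannot set it per process. The core file is dumped either to the current working directory of the process or, if the pattern includes a directory, to the directory set in /proc/sys/kernel/core_pattern.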

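CoreDumpDirectory in Apache is a hack: Apache registers a signal handler for every signal that causes a core dump, and that handler changes the current directory before the dump happens.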

/* handle all varieties of core dumping signals */
static void sig_coredump(int sig)
{
    apr_filepath_set(ap_coredump_dir, pconf);
    apr_signal(sig, SIG_DFL);
#if AP_ENABLE_EXCEPTION_HOOK
    run_fatal_exception_hook(sig);
#endif
    /* linuxthreads issue calling getpid() here:
     *   This comparison won't match if the crashing thread is
     *   some module's thread that runs in the parent process.
     *   The fallout, which is limited to linuxthreads:
     *   The special log message won't be written when such a
     *   thread in the parent causes the parent to crash.
     */
    if (getpid() == parent_pid) {
        ap_log_error(APLOG_MARK, APLOG_NOTICE,
                     0, ap_server_conf,
                     "seg fault or similar nasty error detected "
                     "in the parent process");
        /* XXX we can probably add some rudimentary cleanup code here,
         * like getting rid of the pid file.  If any additional bad stuff
         * happens, we are protected from recursive errors taking down the
         * system since this function is no longer the signal handler   GLA
         */
    }
    kill(getpid(), sig);
    /* At this point we've got sig blocked, because we're still inside
     * the signal handler.  When we leave the signal handler it will
     * be unblocked, and we'll take the signal... and coredump or whatever
     * is appropriate for this particular Unix.  In addition the parent
     * will see the real signal we received -- whereas if we called
     * abort() here, the parent would only see SIGABRT.
     */
}
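
The same trick can be reproduced outside Apache with plain POSIX calls. The following is a minimal sketch, not Apache's code: the names `install_core_dir`, `on_fatal_signal` and `core_dir` are made up for illustration, and it assumes the target directory already exists and is writable by the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical buffer holding the directory that should receive core files. */
static char core_dir[PATH_MAX];

/* Runs on a fatal signal: move to core_dir, restore the default action,
 * then re-raise the signal so the kernel writes the core file there. */
static void on_fatal_signal(int sig)
{
    if (chdir(core_dir) != 0) {
        /* Nothing safe to do here; the core lands in the old directory. */
    }
    signal(sig, SIG_DFL);
    raise(sig);
    /* sig is blocked while the handler runs; it is delivered with the
     * default action (core dump) as soon as the handler returns. */
}

/* Install the handler for the usual core-dumping signals.
 * Returns 0 on success, -1 on failure. */
int install_core_dir(const char *dir)
{
    static const int sigs[] = { SIGSEGV, SIGBUS, SIGABRT, SIGILL, SIGFPE };
    struct sigaction sa;
    size_t i;

    assert(dir != NULL);
    if (strlen(dir) >= sizeof core_dir)
        return -1;
    strcpy(core_dir, dir);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fatal_signal;
    sigemptyset(&sa.sa_mask);

    for (i = 0; i < sizeof sigs / sizeof sigs[0]; i++) {
        if (sigaction(sigs[i], &sa, NULL) != 0)
            return -1;
    }
    return 0;
}
```

A daemon would call `install_core_dir()` once, early in `main()`, with whatever directory it reads from its configuration. This only affects the directory when core_pattern is a relative pattern; an absolute path or a pipe in core_pattern still wins.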

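Answer 1: (score: 0)

This can be achieved with the "|command" mechanism of the core_pattern file. The executed command can create directories and files as needed. The following specifiers can be passed to the command as arguments (see man 5 core):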

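%% a single % character
%c core file size soft resource limit of the crashing process
%d dump mode, same as the value returned by prctl(2) PR_GET_DUMPABLE
%e executable filename (without path prefix)
%E pathname of the executable, with slashes ('/') replaced by exclamation marks ('!')
%g (numeric) real GID of the dumped process
%h hostname (same as nodename returned by uname(2))
%i TID of the thread that triggered the core dump, as seen in the PID namespace in which the thread resides
%I TID of the thread that triggered the core dump, as seen in the initial PID namespace
%p PID of the dumped process, as seen in the PID namespace in which the process resides
%P PID of the dumped process, as seen in the initial PID namespace
%s number of the signal causing the dump
%t time of the dump, expressed as seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC)
%u (numeric) real UID of the dumped process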
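
To make the substitution concrete, here is a simplified illustration of how such a template expands. It is a sketch, not the kernel's implementation: `struct dump_info` and `expand_core_pattern` are hypothetical names, and only %%, %p, %e, %s and %t are handled. As described in core(5), a specifier that is not recognised and a single trailing % are dropped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the values that would be substituted. */
struct dump_info {
    long pid;          /* %p */
    const char *exe;   /* %e */
    int sig;           /* %s */
    long time;         /* %t */
};

/* Expand %p, %e, %s, %t and %% from pattern into out (capacity outsz).
 * Other specifiers are dropped. Returns 0 on success, -1 if out is too small. */
int expand_core_pattern(const char *pattern, const struct dump_info *info,
                        char *out, size_t outsz)
{
    size_t used = 0;
    const char *p;

    assert(pattern != NULL && info != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (p = pattern; *p != '\0'; p++) {
        char piece[64];
        const char *src = piece;
        size_t len;

        if (*p != '%') {
            piece[0] = *p;            /* ordinary character: copy as-is */
            piece[1] = '\0';
        } else {
            p++;
            if (*p == '\0')
                break;                /* a trailing '%' is dropped */
            switch (*p) {
            case '%': strcpy(piece, "%"); break;
            case 'p': snprintf(piece, sizeof piece, "%ld", info->pid); break;
            case 'e': src = info->exe; break;
            case 's': snprintf(piece, sizeof piece, "%d", info->sig); break;
            case 't': snprintf(piece, sizeof piece, "%ld", info->time); break;
            default:  piece[0] = '\0'; break;   /* unhandled: dropped */
            }
        }
        len = strlen(src);
        if (used + len >= outsz)
            return -1;
        memcpy(out + used, src, len + 1);   /* copies the terminator too */
        used += len;
    }
    return 0;
}
```

With `pid = 10606` and `exe = "fail1"`, the template `/tmp/core/%e/core_%p` expands to `/tmp/core/fail1/core_10606`.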

For example, a script can be created as follows (named crash.sh, for example):

#!/bin/bash

# $1: process number on host side (%P)
# $2: program's name (%e)

OUTDIR=/tmp/core/$2
OUTFILE="core_$1"

# Create a sub-directory in /tmp
mkdir -p "$OUTDIR"

# Redirect stdin in a per-process file:
cat > "$OUTDIR"/"$OUTFILE"

exit 0
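
If a compiled helper is preferred over a shell script, the same logic is short in C. This is a rough sketch under stated assumptions: `ensure_dir`, `save_stream` and `handle_dump` are names invented here, the base directory's parent is assumed to exist, and error reporting is reduced to a return code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create dir unless it already exists. Returns 0 on success, -1 on error. */
static int ensure_dir(const char *dir)
{
    if (mkdir(dir, 0755) != 0 && errno != EEXIST)
        return -1;
    return 0;
}

/* Copy everything readable from in into a new file at path. */
static int save_stream(FILE *in, const char *path)
{
    char buf[4096];
    size_t n;
    FILE *out = fopen(path, "wb");

    if (out == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            fclose(out);
            return -1;
        }
    }
    return fclose(out) == 0 ? 0 : -1;
}

/* Store the dump read from in as <base>/<exe>/core_<pid>,
 * mirroring what crash.sh does with $1 (pid) and $2 (exe). */
int handle_dump(const char *base, const char *pid, const char *exe, FILE *in)
{
    char dir[512], path[600];
    int n;

    assert(base != NULL && pid != NULL && exe != NULL && in != NULL);

    n = snprintf(dir, sizeof dir, "%s/%s", base, exe);
    if (n < 0 || (size_t)n >= sizeof dir)
        return -1;
    n = snprintf(path, sizeof path, "%s/core_%s", dir, pid);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    if (ensure_dir(base) != 0 || ensure_dir(dir) != 0)
        return -1;
    return save_stream(in, path);
}
```

A `main()` for the helper would check `argc` and then call `handle_dump("/tmp/core", argv[1], argv[2], stdin)`, since the kernel feeds the core image to the helper on standard input.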

In the shell:

$ chmod +x crash.sh
$ mv crash.sh /tmp  # Put the script in some place
$ sudo su
# echo '|/tmp/crash.sh %P %e' > /proc/sys/kernel/core_pattern
# cat /proc/sys/kernel/core_pattern
|/tmp/crash.sh %P %e
# exit
$

Create a sample program that crashes (fail.c, for example):

int main(void)
{
  char *ptr = (char *)0;

  *ptr = 'q';

  return 0;

}

Compile the program (building several executables) and adjust the core file size limit in the current shell:

$ gcc fail.c -o fail1
$ gcc fail.c -o fail2
$ ulimit -c
0
$ ulimit -c unlimited
$ ulimit -c
unlimited

Run the failing programs several times to get several process IDs:

$ ./fail1
Segmentation fault (core dumped)
$ ./fail2
Segmentation fault (core dumped)
$ ./fail1
Segmentation fault (core dumped)
$ ./fail2
Segmentation fault (core dumped)

Look in /tmp, where core_pattern redirects the core dumps:

$ ls -l /tmp/core
total 8
drwxrwxrwx 2 root root 4096 nov.    3 15:57 fail1
drwxrwxrwx 2 root root 4096 nov.    3 15:57 fail2
$ ls -l /tmp/core/fail1/
total 480
-rw-rw-rw- 1 root root 245760 nov.    3 15:57 core_10606
-rw-rw-rw- 1 root root 245760 nov.    3 15:57 core_10614
$ ls -l /tmp/core/fail2 
total 480
-rw-rw-rw- 1 root root 245760 nov.    3 15:57 core_10610
-rw-rw-rw- 1 root root 245760 nov.    3 15:57 core_10618