Question

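Probably the most common error in a beginner's program is a seg fault. However, when debugging with gdb, I can't tell exactly what the seg fault is or how exactly it happened. For example, the gdb debugger prints something like program received SEGMENTATION FAULT, in main.c:13: bool flag=false or similar. (Actually, I believe the program wouldn't receive a seg fault because of a bool variable definition. There must be something else.)

But this is not specific and doesn't give enough information. I want to know exactly which variable caused the seg fault and where. For example, if I define a pointer in a function without initializing it and then use it afterwards, most of the time I will get a seg fault. I want gdb to tell me exactly: the variable that caused the seg fault is A, and its value and location are ...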

Any ideas?

Answer 1

Well, let me give a simple example.

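#include <stdio.h>

void main()   /* non-standard (int main(void) is portable); kept so the gdb output below matches */
{
    int *p=NULL;                        /* line 5: p is a null pointer */
    printf("I Should be coring now");
    printf("%d", *p);                   /* line 7: dereferences p and faults */
}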

As you might guess, the problem is in printf("%d", *p);. gdb's output points at that line, but it does not say outright which variable is to blame.

bash-4.1$ gdb test
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.SCLC6_4.1.R1.1.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /tmp/test...done.
(gdb) run
Starting program: /tmp/test

Program received signal SIGSEGV, Segmentation fault.
0x080483e6 in main () at test.c:7
7       printf("%d", *p);

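In this trivial case, that is straightforward enough. But for the sake of argument, let us assume we need more information. So let us find the value of the pc (program counter) at the moment the SIGSEGV occurred. The pc effectively tells us which instruction was executing when the program cored.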

Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.SCLC6_4.5.R1.1.1.i686
(gdb) info registers
eax            0x0      0
ecx            0xbffff6f8       -1073744136
edx            0x2c5340 2904896
ebx            0x2c3ff4 2899956
esp            0xbffff710       0xbffff710
ebp            0xbffff738       0xbffff738
esi            0x0      0
edi            0x0      0
eip            0x80483e6        0x80483e6 <main+34>
eflags         0x10292  [ AF SF IF RF ]
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51

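So in our case, the pc is eip 0x80483e6 0x80483e6 <main+34>

Now it is clear that the crash happened while executing the instruction at offset 34 from the start of main. The important question is how to map this instruction back to the high-level source. For that, use the disassemble command (abbreviated disas) as follows.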

(gdb) disas /m main
Dump of assembler code for function main:
4       {
   0x080483c4 <+0>:     push   %ebp
   0x080483c5 <+1>:     mov    %esp,%ebp
   0x080483c7 <+3>:     and    $0xfffffff0,%esp
   0x080483ca <+6>:     sub    $0x20,%esp

5       int *p=NULL;
   0x080483cd <+9>:     movl   $0x0,0x1c(%esp)

6       printf("I Should be coring now");
   0x080483d5 <+17>:    mov    $0x80484c4,%eax
   0x080483da <+22>:    mov    %eax,(%esp)
   0x080483dd <+25>:    call   0x80482f4 <printf@plt>

7       printf("%d", *p);
   0x080483e2 <+30>:    mov    0x1c(%esp),%eax
=> 0x080483e6 <+34>:    mov    (%eax),%edx
   0x080483e8 <+36>:    mov    $0x80484db,%eax
   0x080483ed <+41>:    mov    %edx,0x4(%esp)
   0x080483f1 <+45>:    mov    %eax,(%esp)
   0x080483f4 <+48>:    call   0x80482f4 <printf@plt>

8       }
   0x080483f9 <+53>:    leave
   0x080483fa <+54>:    ret

End of assembler dump.
(gdb)

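Reading the listing: the marked instruction, mov (%eax),%edx, belongs to source line 7 and loads from the address held in eax. The instruction just before it filled eax from 0x1c(%esp), the stack slot that line 5 initialized for p, and info registers shows eax is 0x0. So the fault is the read of *p with p equal to NULL.

This may not always work out of the box, but in my experience it is accurate about 90% of the time.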
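The register dump shows eax is 0x0 at the faulting load, so the address that could not be read is 0. As a rough sketch of how the same fact can be obtained from inside the program (this is not what gdb does internally, and it is not part of the example above), the C below installs a SIGSEGV handler with SA_SIGINFO and reads the faulting address from si_addr. The names probe_read, on_segv, recover_point and fault_address are made up for illustration, and the code assumes a POSIX system such as Linux. Jumping out of a SIGSEGV handler is only reasonable for a small experiment like this one.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper names; none of them come from the original example. */
static sigjmp_buf recover_point;       /* where to resume after a fault */
static void *volatile fault_address;   /* address reported by the kernel */

/* SA_SIGINFO handler: si_addr is the address whose access faulted. */
static void on_segv(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    fault_address = info->si_addr;
    siglongjmp(recover_point, 1);      /* skip past the faulting load */
}

/*
 * Reads *p with a temporary SIGSEGV handler installed.
 * Returns  0 and stores the value in *out if the read succeeded,
 *          1 and stores the faulting address in *bad_addr if it raised SIGSEGV,
 *         -1 if the handler could not be installed.
 */
static int probe_read(const volatile int *p, int *out, void **bad_addr)
{
    struct sigaction sa, old;
    int result;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(recover_point, 1) == 0) {
        *out = *p;                     /* volatile load: may fault */
        result = 0;
    } else {
        *bad_addr = fault_address;     /* we arrived here from on_segv */
        result = 1;
    }

    sigaction(SIGSEGV, &old, NULL);    /* restore the previous handler */
    return result;
}
```

Calling probe_read(NULL, &value, &addr) should return 1 with addr equal to NULL, which matches the eax value in the register dump. For real debugging, gdb or a core dump remains the better tool; this only shows that a seg fault is a signal that carries the offending address.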

Answer 2

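A backtrace (bt), together with the source lines and printouts of the variables involved, is usually enough to diagnose what is causing the problem. For that, you need to build with debug information and without most optimizations (-O0 -g3 in GCC).

Or try other tools, such as Valgrind, which make memory allocation problems easier to diagnose.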
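On the backtrace suggestion: gdb's bt is the usual route, but a similar call-stack dump can also be produced by the program itself. The sketch below is a different technique from gdb's, and it assumes glibc, whose <execinfo.h> provides backtrace() and backtrace_symbols_fd(). The name dump_backtrace and the MAX_FRAMES limit are made up for illustration.

```c
#include <execinfo.h>

#define MAX_FRAMES 64   /* arbitrary cap chosen for this sketch */

/*
 * Writes the current call stack to file descriptor fd, one frame per line,
 * and returns the number of frames captured.
 */
static int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);   /* collect return addresses */

    backtrace_symbols_fd(frames, depth, fd);     /* print them to fd */
    return depth;
}
```

dump_backtrace(2) writes the frames to standard error. Function names typically only appear if the program is linked with -rdynamic; otherwise the lines mostly show raw addresses. Some programs call a helper like this from a SIGSEGV handler, but that is not strictly async-signal-safe, so treat it as a debugging aid rather than something to rely on.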

Want to know exactly what happens when receiving a seg fault in gdb