Debugging MPI with GDB - No symbol "i" in current context

Time: 2014-12-29 16:07:19

Tags: c ubuntu gdb mpi

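I need to debug an MPI application written in C. I want to do it by manually attaching GDB to the running processes, as recommended here (paragraph 6).

The problem is that when I try to print the value of the variable "i", I get this error:

No symbol "i" in current context.

The same thing happens with set var i=5. When I run info local, it just reports "No locals.".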

  • System: Ubuntu 14.04
  • MPICC: cc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
  • GDB: GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1

I compile my code with the command

mpicc -o hello hello.c

and run it with

mpiexec -n 2 ./hello

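I searched for this problem, but the usual advice is to avoid any optimization (-O) options in GCC. That does not help me, because I am not using any of them here and I compile with MPICC. I have already tried declaring the variable "i" as volatile and running mpicc with -g and -O0, but nothing helped.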


GDB message

GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1

Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 3778
Reading symbols from /home/martin/Dokumenty/Programovani/mpi_trenink/hello...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libmpich.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/x86_64-linux-gnu/libmpich.so.10
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /usr/lib/x86_64-linux-gnu/libmpl.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/x86_64-linux-gnu/libmpl.so.1
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/librt-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/librt.so.1
Reading symbols from /usr/lib/libcr.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libcr.so.0
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libpthread-2.19.so...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libgcc_s.so.1
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.19.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libdl.so.2
Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libnss_files-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
0x00007f493e53c9a0 in __nanosleep_nocancel ()
    at ../sysdeps/unix/syscall-template.S:81
81  ../sysdeps/unix/syscall-template.S: No such file or directory.

My code

#include <stdio.h>
#include <mpi.h>

#include <unistd.h> // sleep()

int main(){
    MPI_Init(NULL, NULL);

    /* DEBUGGING STOP */

    int i = 0;
    while(i == 0){
        sleep(30);
    }

    int world_size;
    MPI_Comm_size( MPI_COMM_WORLD, &world_size );

    int process_id; // often called 'world_rank'
    MPI_Comm_rank( MPI_COMM_WORLD, &process_id );

    char processor_name[ MPI_MAX_PROCESSOR_NAME ];
    int name_len;
    MPI_Get_processor_name( processor_name, &name_len );

    printf("Hello! - sent from process %d running on processor %s.\n\
        Number of processors is %d.\n\
        Length of proc name is %d.\n\
        ***********************\n",
        process_id, processor_name, world_size, name_len);

    MPI_Finalize();
    return 0;
}

2 Answers:

Answer 0 (score: 4)

GDB has most likely interrupted the process deep inside the implementation of the sleep(3) function. You can check this by first issuing the bt (backtrace) command:

(gdb) bt
#0  0x00000030e0caca3d in nanosleep () from /lib64/libc.so.6
#1  0x00000030e0cac8b0 in sleep () from /lib64/libc.so.6
#2  0x0000000000400795 in main (argc=1, argv=0x7fff64ae4688) at sleeper.c:9

i does not exist in the frame of nanosleep:

(gdb) info locals
No symbol table info available.

Select the stack frame of the main function by issuing the frame x command (where x is the frame number, 2 in the example shown):

(gdb) f 2
#2  0x0000000000400795 in main (argc=1, argv=0x7fff64ae4688) at sleeper.c:9
9           while(i == 0) { sleep(30); }

i should be there now:

(gdb) info locals
i = 0

You may also need to change the active thread if GDB happens to attach to the wrong one. Many MPI libraries spawn additional threads, e.g. with Intel MPI:

(gdb) info threads
  3 Thread 0x7f8b9fada700 (LWP 39085)  0x00000030e0cdf1b3 in poll () from /lib64/libc.so.6
  2 Thread 0x7f8b9f0d9700 (LWP 39087)  0x00000030e0cdf1b3 in poll () from /lib64/libc.so.6
* 1 Thread 0x7f8ba1b51700 (LWP 39066)  0x00000030e0caca3d in nanosleep () from /lib64/libc.so.6

The thread marked with * is the one being examined. If some other thread is active, switch to the main thread with the thread command (thread 1 in this example).

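Answer 1 (score: 3)

I finally solved the problem. The key is that I had to select the right stack frame with the up command before trying to print the variable "i" or change its value.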


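Step-by-step solution

  1. Compile the code with mpicc -o hello hello.c -g -O0. Launch the program with mpiexec -n 2 ./hello.

  2. Find out the process ID (PID).

    • I used the command ps -e | grep hello
    • Another option is simply to use pstree
    • Finally, you can use the native Linux function getpid()
  3. Next, open a new terminal and start GDB with the command gdb --pid debugged_process_id.

  4. Now type bt in the debugger. The output will look similar to this: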

    #0  0x00007f63667e09a0 in __nanosleep_nocancel ()
    at ../sysdeps/unix/syscall-template.S:81
    #1  0x00007f63667e0854 in __sleep (seconds=0)
    at ../sysdeps/unix/sysv/linux/sleep.c:137
    #2  0x00000000004009ec in main () at hello.c:20
    
  5. As we can see, frame #2 points into hello.c, so we can look at it in more detail. Type up 2. The output will look similar to this:

    #2  0x00000000004009ec in main () at hello.c:20
    warning: Source file is more recent than executable.
    20          sleep(30);
    
  6. Finally, we can print all local variables of this block. Type info local. Output:

    i = 0
    world_size = 0
    process_id = 0
    processor_name = "\000\000\000\000\000\000\000\000 5\026gc\177\000\000\200\306Η\377\177\000\000p\306Η\377\177\000\000.N=\366\000\000\000\000\272\005@\000\000\000\000\000\377\377\377\377\000\000\000\000%0`\236\060\000\000\000\250\361rfc\177\000\000x\n\026gc\177\000\000\320\067`\236\060\000\000\000\377\377\377\177\376\377\377\377\001\000\000\000\000\000\000\000\335\n@\000\000\000\000\000\377\377\377\377\377\377\377\377\000\000\000\000\000\000\000"
    name_len = 1718986550
    
  7. Now we can release the blocking loop with set var i=1 and continue debugging.