GCC optimization causes strange problems - how do I debug this?

Time: 2011-08-16 18:31:15

Tags: c macos optimization gcc

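I'm running into a GCC optimization problem in a C program. It is x86_64 code (C and inline assembly, though having no asm in the .o files is a hassle), running on OS X 10.6.

Here is what it looks like when built with -O0 vs. -O2: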

$ gcc -m64 -std=gnu99 -o bin/ctr keyschedule.c aes.c ctr.c debug.c misc.c -O0 -Wall
$ bin/ctr -e testing/plaintext -o testing/cipher
encrypt_file called with inpath: testing/plaintext
encrypt_file called with outpath: testing/cipher
file_size called with argument 0x7fff5fbff84b
nonce = 6e32b6797e849081
(... no more output, but it works as it should)

$ gcc -m64 -std=gnu99 -o bin/ctr keyschedule.c aes.c ctr.c debug.c misc.c -O2 -Wall
$ bin/ctr -e testing/plaintext -o testing/cipher
encrypt_file called with inpath: testing/plaintext
encrypt_file called with outpath: testing/cipher
file_size called with argument 0x7fff5f000000
Segmentation fault

...with the OS X crash reporter popping up:

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_PROTECTION_FAILURE at 0x00007fff5f000000
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   libSystem.B.dylib               0x00007fff88d3bad8 perror + 52
1   ctr                             0x0000000100001339 file_size + 73 (ctr.c:32)
2   ctr                             0x00000001000017be encrypt_file + 158 (ctr.c:72)
3   ctr                             0x0000000100001bd1 main + 289 (ctr.c:242)
4   ctr                             0x0000000100000bd4 start + 52

...and here is most of the relevant code (let's just say the argument parser is temporary):

int main(int argc, char *argv[]) {
...
    if (strcmp(argv[1], "-e") == 0)
        encrypt_file(argv[2], argv[4], key); // It crashes with static paths as well
...
}

void encrypt_file(const char *inpath, const char *outpath, const unsigned char *key) {
    printf("encrypt_file called with inpath: %s\n", inpath);
    printf("encrypt_file called with outpath: %s\n", outpath);

    // Create a pointer to the correct function to use for this CPU
    void (*aes_encrypt)(const unsigned char *, unsigned char *, const unsigned char *);
    if (test_aesni_support()) {
        aes_encrypt = aes_encrypt_aesni;
    }
    else {
        aes_encrypt = aes_encrypt_c;
    }

    unsigned char expanded_keys[176] = {0};
    aes_expand_key(key, expanded_keys);

    off_t size = file_size(inpath);
    if (size <= 0) {
        fprintf(stderr, "Cannot encrypt a file of size zero!\n");
        exit(1);
    }
    ....
}

off_t file_size(const char *path) {
    printf("file_size called with argument %p\n", path);
    struct stat st;
    if (stat(path, &st) != 0) {
        perror(path);
        exit(1);
    }

    return (st.st_size);
}

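I suspect that something in my code is at fault (rather than assuming the bug is in GCC), but I can't find anything.

If the code above looks fine (I can't imagine the problem being in any other code, since that code is completely unrelated to the crash), what should I do to track this down?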

GDB:

#0  0x00007fff88d3bad8 in perror ()
(gdb) bt
#0  0x00007fff88d3bad8 in perror ()
#1  0x0000000100001339 in file_size (path=0x7fff5f000000 "") at ctr.c:31
#2  0x00000001000017be in encrypt_file (inpath=0x7fff5f000000 "", outpath=0x7fff5fbff860 "testing/cipher", key=0x7fff5fbff6b0 "-~??9?9>?W\n\021\001?N\026") at ctr.c:72
#3  0x0000000100001bd1 in main (argc=<value temporarily unavailable, due to optimizations>, argv=0x7fff5fbff718) at ctr.c:242
(gdb) frame 2
#2  0x00000001000017be in encrypt_file (inpath=0x7fff5f000000 "", outpath=0x7fff5fbff860 "testing/cipher", key=0x7fff5fbff6b0 "-~??9?9>?W\n\021\001?N\026") at ctr.c:72
72      off_t size = file_size(inpath);
(gdb) l
67      }
68  
69      unsigned char expanded_keys[176] = {0};
70      aes_expand_key(key, expanded_keys);
71  
72      off_t size = file_size(inpath);
73      if (size <= 0) {
74          fprintf(stderr, "Cannot encrypt a file of size zero!\n");
75          exit(1);
76      }

Valgrind:

$ valgrind bin/ctr -e testing/plaintext -o testing/cipher
==1165== Memcheck, a memory error detector
==1165== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==1165== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==1165== Command: bin/ctr -e testing/plaintext -o testing/cipher
==1165== 
encrypt_file called with inpath: testing/plaintext
encrypt_file called with outpath: testing/cipher
file_size called with argument 0x7fff5f000000
==1165== Syscall param stat64(path) points to unaddressable byte(s)
==1165==    at 0x2BB52: stat$INODE64 (in /usr/lib/libSystem.B.dylib)
==1165==    by 0x10000179D: encrypt_file (ctr.c:73) // this is the call to file_size() in encrypt_file()
==1165==    by 0x100001BB0: main (ctr.c:243)
==1165==  Address 0x7fff5f000000 is not stack'd, malloc'd or (recently) free'd
==1165== 
file_size: Bad address
==1165== 
==1165== HEAP SUMMARY:
==1165==     in use at exit: 4,184 bytes in 2 blocks
==1165==   total heap usage: 2 allocs, 0 frees, 4,184 bytes allocated
==1165== 
==1165== LEAK SUMMARY:
==1165==    definitely lost: 0 bytes in 0 blocks
==1165==    indirectly lost: 0 bytes in 0 blocks
==1165==      possibly lost: 0 bytes in 0 blocks
==1165==    still reachable: 4,096 bytes in 1 blocks
==1165==         suppressed: 88 bytes in 1 blocks
==1165== Rerun with --leak-check=full to see details of leaked memory
==1165== 
==1165== For counts of detected and suppressed errors, rerun with: -v
==1165== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

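Edit: Replaced encrypt_file() with the unabridged version (up to the point where it crashes), added GDB info, and replaced the OS X crash reporter output with the output from running a -g binary.

Edit 2: Added valgrind output.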

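2 answers:

Answer 0 (score: 3)

Your variable inpath appears to have been (partially) overwritten; I suspect that if you comment out the call to aes_expand_key, the problem will go away.

Without seeing aes_expand_key, my guess is that it contains inline assembly that fails to preserve at least one register that the target platform's ABI requires to be preserved. In the -O0 case, inpath is probably kept on the stack and is therefore unaffected, whereas in the -O2 case it is probably held in a register whose low 32 bits are being clobbered.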
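To illustrate what "preserving registers" means for GCC inline assembly, here is a minimal sketch. The question does not show aes_expand_key or test_aesni_support, so the helper below (cpu_has_aesni_sketch) is a hypothetical stand-in, not the asker's code. It runs the CPUID instruction, which overwrites EAX, EBX, ECX and EDX, and tells the compiler about every one of them, either as an operand or in the clobber list. If "rbx" were left out of the clobber list, an optimizing build would be free to keep a live value (such as a pointer argument) in that register across the asm statement and would then read back garbage.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the question.
 * Returns 1 if CPUID leaf 1 reports AES-NI (ECX bit 25), else 0. */
static int cpu_has_aesni_sketch(void) {
#if defined(__x86_64__)
    uint32_t eax = 1;  /* input: CPUID leaf 1 (feature flags) */
    uint32_t ecx = 0;  /* input: sub-leaf 0; output: feature bits */

    __asm__ volatile (
        "cpuid"
        : "+a"(eax), "+c"(ecx)  /* EAX and ECX are read and written */
        :                       /* no pure inputs */
        : "rbx", "rdx"          /* CPUID also overwrites these two */
    );
    return (int)((ecx >> 25) & 1u);
#else
    return 0;  /* assumption for this sketch: report "no AES-NI" elsewhere */
#endif
}
```

Auditing each asm statement in the project against this pattern (every modified register appears as an output or a clobber, plus "memory" and "cc" where the code writes memory or changes flags) is one way to test the hypothesis above.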

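Answer 1 (score: 0)

It sounds like you have some kind of memory corruption somewhere. The path argument in file_size is clearly an invalid pointer by the time the function receives it.

If what you posted for encrypt_file is everything that happens in that function before the call to file_size, then the culprit is almost certainly aes_expand_key. Are you sure you are passing it a correctly sized buffer at 176 bytes? If that buffer is too small, aes_expand_key will overwrite other data on the stack, including the local variable inpath. Try increasing the size of expanded_keys, and check the documentation for aes_expand_key to see how large the buffer needs to be.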
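One way to check the buffer-size theory without guessing is to surround the 176-byte window with guard bytes and see whether they survive the call. The sketch below assumes only the call shape visible in the question (key pointer in, output buffer pointer out); the names writes_out_of_bounds, expand_ok and expand_overrun, the guard size and the guard pattern are all made up for illustration, and the two expand_* functions are dummies standing in for the real aes_expand_key.

```c
#include <stddef.h>
#include <string.h>

#define KEY_SCHEDULE_BYTES 176   /* buffer size used in the question */
#define GUARD_BYTES        32    /* arbitrary padding for this sketch */
#define GUARD_PATTERN      0xA5  /* arbitrary fill value */

/* Same shape as the aes_expand_key(key, expanded_keys) call in the question. */
typedef void (*expand_fn)(const unsigned char *key, unsigned char *out);

/* Returns 1 if fn modified a guard byte on either side of the
 * 176-byte window, 0 if all guard bytes are intact. A write that
 * happens to store GUARD_PATTERN itself would go unnoticed. */
static int writes_out_of_bounds(expand_fn fn, const unsigned char *key) {
    unsigned char buf[GUARD_BYTES + KEY_SCHEDULE_BYTES + GUARD_BYTES];
    memset(buf, GUARD_PATTERN, sizeof buf);

    fn(key, buf + GUARD_BYTES);

    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (buf[i] != GUARD_PATTERN)
            return 1;  /* wrote before the window */
        if (buf[GUARD_BYTES + KEY_SCHEDULE_BYTES + i] != GUARD_PATTERN)
            return 1;  /* wrote past the window */
    }
    return 0;
}

/* Dummy stand-in that stays inside the 176 bytes. */
static void expand_ok(const unsigned char *key, unsigned char *out) {
    for (size_t i = 0; i < KEY_SCHEDULE_BYTES; i++)
        out[i] = key[i % 16];
}

/* Dummy stand-in that deliberately writes 4 bytes too many. */
static void expand_overrun(const unsigned char *key, unsigned char *out) {
    (void)key;
    for (size_t i = 0; i < KEY_SCHEDULE_BYTES + 4; i++)
        out[i] = 0x00;
}
```

Passing the real aes_expand_key in place of the dummies would show whether it stays inside 176 bytes. Note that this only catches stray writes into the guard regions; it would not detect a clobbered register, which is the other hypothesis raised for this crash.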