Question

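First of all, this is the strangest bug I have ever seen, and I have no idea what is going on. Any help figuring out what is happening would be greatly appreciated.

I am writing a C program that reads files into dynamically allocated chunks and performs various operations on those chunks (encryption, decryption, MAC, and so on). When I run it on certain (larger?) files, it segfaults. I figured I must be stomping on memory somewhere or allocating it incorrectly, so I ran it under valgrind to find out what was going on, and the problem went away. Valgrind reports no errors, and the program does not segfault and works as expected.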

normal run
./threefizer -e -p 1234567 new_name.docx
...
[1]    25017 segmentation fault  ./threefizer -e -p 1234567 new_name.docx

valgrind run
valgrind ./threefizer -e -p 1234567 new_name.docx
==25238== Memcheck, a memory error detector
==25238== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==25238== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==25238== Command: ./threefizer -e -p 1234567 new_name.docx
==25238== 
...
Threefizer operation complete
==25238== 
==25238== HEAP SUMMARY:
==25238==     in use at exit: 0 bytes in 0 blocks
==25238==   total heap usage: 25 allocs, 25 frees, 977,995 bytes allocated
==25238== 
==25238== All heap blocks were freed -- no leaks are possible
==25238== 
==25238== For counts of detected and suppressed errors, rerun with: -v
==25238== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

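What is going on here? I have seen valgrind prevent a segfault before, because it allocates memory differently inside its own virtual machine. But whenever that happened, valgrind always reported at least one error of some kind.

Digging in with gdb, I found that the segfault occurs when my program tries to modify the contents of a chunk that was filled by a file read.

My file read function looks like this. Does anyone see anything wrong with it? Functionally it seems fine.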

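uint8_t* readBlock(const uint64_t data_size, FILE* read)
{
    pdebug("readBlock()\n");
    if(ferror(read))
    {
        fclose(read);
        perror("Error reading block");
        return NULL;
    }

    /* Not const: the buffer is filled by fread() and later modified by the caller. */
    uint8_t* data = calloc(data_size, sizeof(uint8_t));
    if(data == NULL)
    {
        perror("Unable to allocate block");
        return NULL;
    }
    const uint64_t size = fread(data, sizeof(uint8_t), data_size, read);

    if(ferror(read))
    {
        fclose(read);
        perror("Error reading block");
        free(data); /* do not leak the buffer on a read error */
        return NULL;
    }

    if(size == data_size)
    {
        return data;
    }

    /* A short read without ferror() means end of file, so errno is not meaningful here. */
    fprintf(stderr, "Unable to read requested number of bytes\n");
    free(data);
    return NULL;
}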

Here is a gdb session showing the read that causes my program to segfault.

***Before read***
queueFile()
Breakpoint 1, queueFile (args=0x7fffffffe460, out=0x62c030)
at controller.c:172
//this is where the data is being read and allocated
172                       data_chunk->data = pad(readBlock(orig_file_size, read),
(gdb) print data_chunk->data
$1 = (uint64_t *) 0x0 //the brand new pointer is null as it should be with nothing allocated

***After read***
(gdb) next
readBlock()
176                       data_chunk->data_size = getPadSize(orig_file_size, args->state_size);
(gdb) print data_chunk->data
$4 = (uint64_t *) 0xfffffffff7ee7010 //why is this memory address so big?
(gdb) print data_chunk->data[0]
Cannot access memory at address 0xfffffffff7ee7010 //can't access memory WTF?
//When this pointer is getting passed to anything else that attempts to access or modify it then the program segfaults

The segfault happens as soon as the program tries to do anything with the data returned by readBlock(). In this case, it is trying to encrypt it.

***Interestingly the address that is causing the program to segfault is 12 hex digits instead of the regular 6 why?***
Program received signal SIGSEGV, Segmentation fault.
0x0000000000402413 in cbc512Encrypt (key=0x62be78 <runThreefizer.tf_key>, 
iv=0x62c2b0, plain_text=0xfffffffff7ee7010, num_blocks=7611) at cbc.c:206
206     plain_text[0] ^= iv[0]; plain_text[1] ^= iv[1]; plain_text[2] ^= iv[2]; plain_text[3] ^= iv[3];

When I access the same memory from inside readBlock() using gdb, I can access it just fine, and it contains the correct contents of the file being read.

***GDB session showing the readBlock() that runs before segfault***
(gdb) continue
Continuing.
readBlock()

Breakpoint 2, readBlock (data_size=487073, read=0x62c360) at fileIO.c:40
40          return data;
(gdb) print data
$7 = (uint8_t *) 0x7ffff7f5e010 "PK\003\004\024" //again whats with the giant memory address all the other ones are only 6 hex digits
(gdb) print size
$8 = 487073
(gdb) print data[0] //as you can see we can access the data just fine and its contents correspond to the first 8 characters of the file that was read
$9 = 80 'P'
(gdb) print data[1]
$10 = 75 'K'
(gdb) print data[2]
$11 = 3 '\003'
(gdb) print data[3]
$12 = 4 '\004'
(gdb) print data[4]
$13 = 20 '\024'
(gdb) print data[5]
$14 = 0 '\000'
(gdb) print data[6]
$15 = 6 '\006'
(gdb) print data[7]
$16 = 0 '\000'
(gdb) 

***The same memory address is inaccessible from the function that calls readBlock()***
(gdb) break controller.c:173
Breakpoint 3 at 0x4038fa: file controller.c, line 173.
(gdb) continue
Continuing.

Breakpoint 3, queueFile (args=0x7fffffffe460, out=0x62c030)
at controller.c:176
176                       data_chunk->data_size = getPadSize(orig_file_size, args->state_size);
(gdb) print data_chunk->data
$17 = (uint64_t *) 0xfffffffff7ee7010 //this is the same address as data in readBlock() and also returned by readBlock()
(gdb) print data_chunk->data[0] 
Cannot access memory at address 0xfffffffff7ee7010 //WHY NOT!?

So, does anyone have any idea what is going on?

Answer 1

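May I venture a guess? Where you call readBlock(), have you made sure that the prototype for readBlock is included? The reason I ask is that inside readBlock your data appears to have the address 0x7ffff7f5e010, but in the caller it appears to have changed to 0xfffffffff7ee7010. That can happen if the caller thinks readBlock returns an int (that is, there is no prototype in scope): the 64-bit pointer is cut down to 32 bits and then sign-extended when it is converted back to a pointer, which produces the leading f digits.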
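The truncation can be reproduced with plain integer arithmetic. The following is only an illustrative sketch: `through_implicit_int` is a hypothetical helper, not part of the asker's program, and it assumes a typical x86-64 ABI where `int` is 32 bits and pointers are 64 bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the question): models what a caller sees
 * when a 64-bit pointer comes back through a function that the compiler
 * assumes returns int because no prototype was in scope.
 */
uint64_t through_implicit_int(uint64_t real_address)
{
    /* Only the low 32 bits fit in an int return value. */
    uint32_t low_bits = (uint32_t)real_address;

    /* The caller treats those bits as a signed int. This conversion is
     * implementation-defined; common compilers simply reinterpret the bits. */
    int32_t as_int = (int32_t)low_bits;

    /* Converting the int back to pointer width sign-extends it. */
    return (uint64_t)(int64_t)as_int;
}
```

Passing 0x7ffff7f5e010 (the address gdb shows inside readBlock) yields 0xfffffffff7f5e010, which has the same shape as the inaccessible pointer seen in the caller. A small address such as 0x62c030 comes back unchanged, which would explain why only some allocations crash. It may also be why the run under valgrind succeeded, if the addresses handed out there happened to fit in 31 bits; the question does not show those addresses, so that part is a guess.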

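I copied this over from the comments, since it solved your problem. By the way, +1 for being one of the few people who uses a debugger and tries to isolate the problem with it.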
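For reference, a minimal sketch of what the fix might look like. The header name `fileIO.h` and the caller `load_block` are assumptions made for illustration, and the body of `readBlock` is a simplified stand-in for the function in the question; the only point is that the declaration is visible before the call.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* In a real project this declaration would live in a header (for example a
 * hypothetical fileIO.h) included by both fileIO.c and controller.c. */
uint8_t *readBlock(uint64_t data_size, FILE *read);

/* Hypothetical caller standing in for queueFile(). Because the prototype
 * above is in scope, the compiler knows the return value is a full pointer. */
uint8_t *load_block(FILE *read, uint64_t data_size)
{
    return readBlock(data_size, read);
}

/* Simplified stand-in for the function from the question. */
uint8_t *readBlock(uint64_t data_size, FILE *read)
{
    uint8_t *data = calloc(data_size, sizeof(uint8_t));
    if (data == NULL) {
        return NULL;
    }
    if (fread(data, sizeof(uint8_t), data_size, read) != data_size) {
        free(data); /* short read or error: release the buffer */
        return NULL;
    }
    return data;
}
```

With GCC, compiling with `-Wall` (which enables `-Wimplicit-function-declaration` for C) or `-Werror=implicit-function-declaration` should flag a call made without a visible declaration, so this class of bug shows up at build time instead of as a segfault.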
