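First off, this is the strangest bug I have ever seen, and I have no idea what is going on. Any help figuring out what is happening would be greatly appreciated.
I am writing a C program that reads files into dynamically allocated chunks and performs various operations on those chunks (encryption, decryption, MAC, and so on). When I run it on certain (larger?) files, it segfaults. I figured I must be stomping on memory somewhere or allocating it incorrectly, so I ran it under valgrind to find out what was going on, and the problem went away. Valgrind reports no errors, and the program does not segfault and works as expected.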
Normal run:
./threefizer -e -p 1234567 new_name.docx
...
[1] 25017 segmentation fault ./threefizer -e -p 1234567 new_name.docx
Valgrind run:
valgrind ./threefizer -e -p 1234567 new_name.docx
==25238== Memcheck, a memory error detector
==25238== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==25238== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==25238== Command: ./threefizer -e -p 1234567 new_name.docx
==25238==
...
Threefizer operation complete
==25238==
==25238== HEAP SUMMARY:
==25238== in use at exit: 0 bytes in 0 blocks
==25238== total heap usage: 25 allocs, 25 frees, 977,995 bytes allocated
==25238==
==25238== All heap blocks were freed -- no leaks are possible
==25238==
==25238== For counts of detected and suppressed errors, rerun with: -v
==25238== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
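What is going on here? I have seen valgrind prevent a segfault before, because it allocates memory differently inside its own virtual machine. But whenever that happened, valgrind had always reported at least one error of some kind.
Digging in with gdb, I found that the segfault happens when my program tries to modify the contents of the chunk that was filled by the file read.
My file reading function looks like this. Does anyone see anything wrong with it? Functionally it seems fine.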
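uint8_t* readBlock(const uint64_t data_size, FILE* read)
{
    pdebug("readBlock()\n");
    if(ferror(read))
    {
        fclose(read);
        perror("Error reading block");
        return NULL;
    }
    /* data must not be const: fread() writes into it and the caller modifies it */
    uint8_t* data = calloc(data_size, sizeof(uint8_t));
    if(data == NULL)
    {
        perror("Unable to allocate block");
        return NULL;
    }
    const uint64_t size = fread(data, sizeof(uint8_t), data_size, read);
    if(ferror(read))
    {
        free(data); /* do not leak the block on a failed read */
        fclose(read);
        perror("Error reading block");
        return NULL;
    }
    if(size == data_size)
    {
        return data;
    }
    /* a short read without a stream error does not set errno, so skip perror() */
    fprintf(stderr, "Unable to read requested number of bytes\n");
    free(data);
    return NULL;
}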
The gdb session showing the read that causes my program to segfault:
***Before read***
queueFile()
Breakpoint 1, queueFile (args=0x7fffffffe460, out=0x62c030)
at controller.c:172
//this is where the data is being read and allocated
172 data_chunk->data = pad(readBlock(orig_file_size, read),
(gdb) print data_chunk->data
$1 = (uint64_t *) 0x0 //the brand new pointer is null as it should be with nothing allocated
***After read***
(gdb) next
readBlock()
176 data_chunk->data_size = getPadSize(orig_file_size, args->state_size);
(gdb) print data_chunk->data
$4 = (uint64_t *) 0xfffffffff7ee7010 //why is this memory address so big?
(gdb) print data_chunk->data[0]
Cannot access memory at address 0xfffffffff7ee7010 //can't access memory WTF?
//When this pointer is getting passed to anything else that attempts to access or modify it then the program segfaults
The segfault occurs when the program tries to do anything with the data from readBlock(). In this case, it is trying to encrypt it.
***Interestingly, the address that is causing the program to segfault is 16 hex digits instead of the regular 6. Why?***
Program received signal SIGSEGV, Segmentation fault.
0x0000000000402413 in cbc512Encrypt (key=0x62be78 <runThreefizer.tf_key>,
iv=0x62c2b0, plain_text=0xfffffffff7ee7010, num_blocks=7611) at cbc.c:206
206 plain_text[0] ^= iv[0]; plain_text[1] ^= iv[1]; plain_text[2] ^= iv[2]; plain_text[3] ^= iv[3];
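When I access the same memory with gdb from inside readBlock(), I can read it without any problem, and it contains the correct contents of the file being read.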
***GDB session showing the readBlock() that runs before segfault***
(gdb) continue
Continuing.
readBlock()
Breakpoint 2, readBlock (data_size=487073, read=0x62c360) at fileIO.c:40
40 return data;
(gdb) print data
$7 = (uint8_t *) 0x7ffff7f5e010 "PK\003\004\024" //again whats with the giant memory address all the other ones are only 6 hex digits
(gdb) print size
$8 = 487073
(gdb) print data[0] //as you can see we can access the data just fine and its contents correspond to the first 8 characters of the file that was read
$9 = 80 'P'
(gdb) print data[1]
$10 = 75 'K'
(gdb) print data[2]
$11 = 3 '\003'
(gdb) print data[3]
$12 = 4 '\004'
(gdb) print data[4]
$13 = 20 '\024'
(gdb) print data[5]
$14 = 0 '\000'
(gdb) print data[6]
$15 = 6 '\006'
(gdb) print data[7]
$16 = 0 '\000'
(gdb)
***The same memory address is inaccessible from the function that calls readBlock()***
(gdb) break controller.c:173
Breakpoint 3 at 0x4038fa: file controller.c, line 173.
(gdb) continue
Continuing.
Breakpoint 3, queueFile (args=0x7fffffffe460, out=0x62c030)
at controller.c:176
176 data_chunk->data_size = getPadSize(orig_file_size, args->state_size);
(gdb) print data_chunk->data
$17 = (uint64_t *) 0xfffffffff7ee7010 //this is the same address as data in readBlock() and also returned by readBlock()
(gdb) print data_chunk->data[0]
Cannot access memory at address 0xfffffffff7ee7010 //WHY NOT!?
So does anyone have any idea what is going on?
Answer 0 (score: 2)
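May I venture a guess? Does the place where you call readBlock() make sure to include the prototype for readBlock? The reason I ask is that inside readBlock your data appears to have an address of 0x7ffff7f5e010, but in the caller it appears to have changed to 0xfffffffff7ee7010. That can happen if the caller thinks readBlock returns an int (i.e. there is no prototype): on a 64-bit system the returned pointer is cut down to 32 bits and then sign-extended back to 64 bits, which produces the run of leading f digits.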
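I copied this from the comments because it solved your problem. By the way, +1 for being one of the few people who uses a debugger and tries to isolate their problem with it.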