Why can I not read Big Endian ELF files the same way as Little Endian files in C++?

Date: 2018-03-22 23:33:50

Tags: c++ elf readelf

Essentially, I am doing something similar to https://wiki.osdev.org/ELF_Tutorial, where I load the data into structs and read the various sections by their offsets. The host is little endian and I'm trying to analyze files that were cross-compiled for a big endian target. I ran the same code on these big endian files as on the little endian files, but it segfaults when trying to access the sections.

int fd = open(filename, O_RDONLY);
char *header_start = (char *)mmap(0, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
Elf32_Ehdr* elf_ehdr = (Elf32_Ehdr *)header_start;
Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)((int)header_start + elf_ehdr->e_shoff);
Elf32_Shdr* sh_strtab = &elf_shdrs[elf_ehdr->e_shstrndx];
// code segfaults here when trying to access sh_strtab->sh_offset for big endian
// files, but works just fine for little endian files

Why does the code fail for big endian files?

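1 Answer:

Answer 0 (score: 1)

In a big endian file, elf_ehdr->e_shoff is stored as a big endian integer and has to be read with big endian byte order in mind.

Say we're working in 32 bits and e_shoff is a nice small number like 64. In a big endian file it is recorded as 0x00000040, that is, the bytes 00 00 00 40. But you are reading the file on a little endian CPU, so those four bytes come out of the file as a binary blob and the CPU interprets them as 0x40000000, which is 1073741824.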
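To make the arithmetic concrete, here is a minimal sketch that decodes the same four bytes both ways. The names DecodeLE32, DecodeBE32 and kShoffBytes are made up for this illustration; they are not part of any ELF header or library.

```cpp
#include <cstdint>

// Decode four bytes as little endian: the first byte is the least significant.
inline std::uint32_t DecodeLE32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode four bytes as big endian: the first byte is the most significant.
inline std::uint32_t DecodeBE32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}

// The bytes a big endian ELF file would hold for e_shoff == 64 (illustrative).
const unsigned char kShoffBytes[4] = {0x00, 0x00, 0x00, 0x40};
```

DecodeBE32(kShoffBytes) gives 64, the value the file intended, while DecodeLE32(kShoffBytes) gives 1073741824, which is what a little endian host sees when it reads the field straight out of the mapped struct.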

Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)((int)header_start + elf_ehdr->e_shoff);

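resolves to

Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)((int)header_start + 1073741824);

instead of

Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)((int)header_start + 64);

and misses the target by a wide margin. Attempting to access members of the resulting elf_shdrs wanders into undefined behavior.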

The quick fix is

Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)(header_start + ResolveEndian(elf_ehdr->e_shoff));

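where ResolveEndian is a family of overloaded functions that either do absolutely nothing, because the file's endianness matches the system's, or flip the byte order. For many examples of how to do this, see How do I convert between big-endian and little-endian values in C++?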
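One possible shape for such a ResolveEndian family is sketched below. It is only an illustration: the second parameter (file_is_little_endian) and the helpers HostIsLittleEndian and SwapBytes are assumptions of this sketch, not something the call above requires. Other designs would work just as well, for example storing the file's byte order in a parser object.

```cpp
#include <cstdint>
#include <cstring>

// Runtime check of the host byte order (hypothetical helper).
inline bool HostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);   // look at the lowest-addressed byte
    return first == 1;
}

// Unconditional byte swaps for the two field widths used by Elf32 structs.
inline std::uint16_t SwapBytes(std::uint16_t v) {
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

inline std::uint32_t SwapBytes(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Return v unchanged when file and host byte order agree, swapped otherwise.
inline std::uint16_t ResolveEndian(std::uint16_t v, bool file_is_little_endian) {
    return file_is_little_endian == HostIsLittleEndian() ? v : SwapBytes(v);
}

inline std::uint32_t ResolveEndian(std::uint32_t v, bool file_is_little_endian) {
    return file_is_little_endian == HostIsLittleEndian() ? v : SwapBytes(v);
}
```

The file's byte order can be taken from elf_ehdr->e_ident[EI_DATA], which is a single byte and therefore safe to read on any host: ELFDATA2LSB means little endian, ELFDATA2MSB means big endian. A call would then look like ResolveEndian(elf_ehdr->e_shoff, elf_ehdr->e_ident[EI_DATA] == ELFDATA2LSB). Every multi-byte field needs the same treatment, including e_shstrndx and the members of each Elf32_Shdr.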

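The longer fix is to not use a memory-mapped file at all and instead deserialize the file, accounting for endianness and for the different variable sizes (and the resulting offset differences) between 32-bit and 64-bit programs. This gives a more robust and portable parser, one that works regardless of the source ELF and of the compiler implementation used to build the parser.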
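A rough sketch of the deserializing approach follows, limited to the ELF32 header fields needed to find the section header table. The names SectionTableInfo, ByteReader and ParseSectionTableInfo are invented for this example. The field offsets (0x20, 0x2E, 0x30, 0x32) are those of the ELF32 file header; a 64-bit file uses a different layout and would need its own branch, which is omitted here.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical subset of the ELF32 file header, held in host byte order.
struct SectionTableInfo {
    std::uint32_t shoff;      // e_shoff
    std::uint16_t shentsize;  // e_shentsize
    std::uint16_t shnum;      // e_shnum
    std::uint16_t shstrndx;   // e_shstrndx
};

// Reads integers byte by byte, so the host's own byte order never matters.
class ByteReader {
public:
    ByteReader(const std::vector<unsigned char>& data, bool big_endian)
        : data_(data), big_endian_(big_endian) {}

    std::uint16_t U16(std::size_t offset) const {
        return static_cast<std::uint16_t>(Read(offset, 2));
    }
    std::uint32_t U32(std::size_t offset) const { return Read(offset, 4); }

private:
    std::uint32_t Read(std::size_t offset, std::size_t width) const {
        if (offset > data_.size() || width > data_.size() - offset)
            throw std::out_of_range("read past end of ELF image");
        std::uint32_t value = 0;
        for (std::size_t i = 0; i < width; ++i) {
            // Visit bytes from most significant to least significant.
            std::size_t idx = big_endian_ ? offset + i : offset + width - 1 - i;
            value = (value << 8) | data_[idx];
        }
        return value;
    }

    const std::vector<unsigned char>& data_;
    bool big_endian_;
};

inline SectionTableInfo ParseSectionTableInfo(const std::vector<unsigned char>& image) {
    if (image.size() < 52 || image[0] != 0x7F || image[1] != 'E' ||
        image[2] != 'L' || image[3] != 'F')
        throw std::runtime_error("not an ELF file");
    if (image[4] != 1)                       // EI_CLASS: 1 == ELFCLASS32
        throw std::runtime_error("only ELF32 is handled in this sketch");
    if (image[5] != 1 && image[5] != 2)      // EI_DATA: 1 == LSB, 2 == MSB
        throw std::runtime_error("unknown data encoding");

    ByteReader r(image, image[5] == 2);
    return SectionTableInfo{r.U32(0x20), r.U16(0x2E), r.U16(0x30), r.U16(0x32)};
}
```

Because every field is assembled from individual bytes and every read is bounds-checked, a wrong offset turns into an exception rather than a segfault, and the same parser handles both byte orders on any host.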