Question

Essentially, I am doing something similar to https://wiki.osdev.org/ELF_Tutorial, where I load the data into structs and read the various sections by their offsets. The host is little endian and I'm trying to analyze files that were cross-compiled for a big endian target. I tried doing the same code sequence with these big endian files as with the little endian files, but the code segfaults when trying to access the sections.

int fd = open(filename, O_RDONLY);
char *header_start = (char *)mmap(0, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
Elf32_Ehdr* elf_ehdr = (Elf32_Ehdr *)header_start;
Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)((int)header_start + elf_ehdr->e_shoff);
Elf32_Shdr* sh_strtab = &elf_shdrs[elf_ehdr->e_shstrndx];
// code segfaults here when trying to access sh_strtab->sh_offset for big endian
// files, but works just fine for little endian files

Why does the code fail for big endian files?

Answer 1

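In a big endian file, elf_ehdr->e_shoff is stored as a big endian integer, so it has to be read with big endian byte order.

Assume we are working with 32-bit values and e_shoff is a nice small number such as 64. In big endian it is recorded in the file as the bytes 00 00 00 40, that is 0x00000040. But you are reading the file on a little endian CPU, so those four bytes are read from the file as a binary blob and interpreted by the CPU as 0x40000000, which is 1073741824.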

Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)((int)header_start + elf_ehdr->e_shoff);

resolves to

Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)((int)header_start + 1073741824);

not

Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)((int)header_start + 64);

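and misses the target by a wide margin. Accessing members through the resulting elf_shdrs pointer wanders into undefined behavior.

The quick fix is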
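The misreading described above can be reproduced without an ELF file. The following is a minimal sketch, not part of the original answer; load_be32 and load_le32 are hypothetical helper names that assemble a 32-bit value from four bytes in each byte order, independent of the host CPU.

```cpp
#include <cstdint>

// Hypothetical helper: interpret four bytes as a big endian 32-bit value
// (most significant byte first), regardless of the host's byte order.
std::uint32_t load_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// Hypothetical helper: interpret the same four bytes as little endian
// (least significant byte first). This is what a little endian host
// effectively does when it reads the field straight out of the struct.
std::uint32_t load_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}
```

For the bytes 00 00 00 40, load_be32 gives 64 and load_le32 gives 1073741824, matching the two offsets shown above.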

Elf32_Shdr* elf_shdrs = (Elf32_Shdr *)(header_start + ResolveEndian(elf_ehdr->e_shoff));

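where ResolveEndian is a family of overloaded functions that either do nothing at all, because the file's endianness matches the system's byte order, or flip the byte order. For many examples of how to do this, see How do I convert between big-endian and little-endian values in C++? Note that the fix also drops the (int) cast: pointer arithmetic on the char* header_start already advances in bytes, and casting a pointer to int truncates it on platforms with 64-bit pointers.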
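One way such overloads might look is sketched below. This is an illustration under assumptions, not the answer's actual implementation: the second parameter file_is_little_endian, and the helpers HostIsLittleEndian and SwapBytes, are hypothetical additions, since the answer only names ResolveEndian and does not say how the file's byte order reaches it.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the host stores the least significant
// byte of an integer at the lowest address.
inline bool HostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Hypothetical helpers: reverse the byte order of a 16-bit or 32-bit value.
inline std::uint16_t SwapBytes(std::uint16_t v) {
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

inline std::uint32_t SwapBytes(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Assumed signature: the caller passes the file's byte order explicitly.
// The value is returned unchanged when file and host agree, and
// byte-swapped otherwise.
inline std::uint16_t ResolveEndian(std::uint16_t v, bool file_is_little_endian) {
    return file_is_little_endian == HostIsLittleEndian() ? v : SwapBytes(v);
}

inline std::uint32_t ResolveEndian(std::uint32_t v, bool file_is_little_endian) {
    return file_is_little_endian == HostIsLittleEndian() ? v : SwapBytes(v);
}
```

With this shape, the file's byte order could be taken from the ELF identification bytes, for example ResolveEndian(elf_ehdr->e_shoff, elf_ehdr->e_ident[EI_DATA] == ELFDATA2LSB), assuming Elf32_Off is a typedef for a 32-bit unsigned type as in glibc's elf.h. The same treatment is needed for every multi-byte field that is read afterwards, including e_shstrndx and sh_offset.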

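The longer fix is not to use a memory-mapped file at all, but to deserialize the file, accounting for the different field sizes in 32-bit and 64-bit programs (and the resulting differences in offsets) as well as for endianness. This leads to a more robust and portable parser that works regardless of the source ELF and of the compiler implementation used to build the parser.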
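A rough sketch of the deserializing approach follows, assuming the whole file has already been read into a byte vector. It covers only the ELF32 header fields needed to locate the section header table; the names ByteReader, SectionTableInfo and ReadElf32SectionTableInfo are hypothetical, and a 64-bit file would need its own field widths and offsets. The offsets used (32, 46, 48, 50) are those of e_shoff, e_shentsize, e_shnum and e_shstrndx in the 52-byte ELF32 header.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical reader: decodes integers byte by byte, so the result does
// not depend on the host's byte order or on struct layout.
class ByteReader {
public:
    ByteReader(const std::vector<unsigned char>& data, bool big_endian)
        : data_(data), big_endian_(big_endian) {}

    std::uint16_t u16(std::size_t off) const {
        check(off, 2);
        const unsigned a = data_[off], b = data_[off + 1];
        return static_cast<std::uint16_t>(big_endian_ ? (a << 8) | b
                                                      : (b << 8) | a);
    }

    std::uint32_t u32(std::size_t off) const {
        check(off, 4);
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < 4; ++i) {
            // Big endian: first byte is most significant; little: last is.
            const std::size_t idx = big_endian_ ? off + i : off + 3 - i;
            v = (v << 8) | data_[idx];
        }
        return v;
    }

private:
    void check(std::size_t off, std::size_t n) const {
        if (off > data_.size() || n > data_.size() - off)
            throw std::out_of_range("read past end of file");
    }
    const std::vector<unsigned char>& data_;
    bool big_endian_;
};

// Hypothetical result type: where the section header table lives.
struct SectionTableInfo {
    std::uint32_t shoff;
    std::uint16_t shentsize, shnum, shstrndx;
};

SectionTableInfo ReadElf32SectionTableInfo(const std::vector<unsigned char>& file) {
    if (file.size() < 52) throw std::runtime_error("too short for an ELF32 header");
    if (file[0] != 0x7f || file[1] != 'E' || file[2] != 'L' || file[3] != 'F')
        throw std::runtime_error("not an ELF file");
    if (file[4] != 1) throw std::runtime_error("not ELFCLASS32");
    // e_ident[EI_DATA]: 1 is little endian (ELFDATA2LSB), 2 is big endian (ELFDATA2MSB).
    if (file[5] != 1 && file[5] != 2) throw std::runtime_error("unknown data encoding");
    const ByteReader r(file, file[5] == 2);
    return SectionTableInfo{r.u32(32), r.u16(46), r.u16(48), r.u16(50)};
}
```

Because every value is assembled from individual bytes and every read is bounds-checked, a wrong or corrupt offset produces an exception here instead of a segfault.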

Why can I not read Big Endian ELF files the same way as Little Endian files in C++?
