我目前正在编写一个编译器(http://curly-lang.org,如果您很好奇),并且在尝试在最新的Linux内核上运行生成的ELF二进制文件时遇到一个奇怪的错误。相同的二进制文件在较旧的内核上运行良好(我曾在几个Ubuntu机器上尝试过,但都使用4.4.0-1049-aws),但在更新后的Arch机器(uname 4.17.11-arch1)上,我什至无法打开它们在GDB下。
GDB给出的错误消息是During startup program terminated with signal SIGSEGV, Segmentation fault
,据我所知,这表明在运行第一条指令之前无法加载程序段。
我用GCC / NASM编译了一个最小的ELF可执行文件来尝试重现该问题,但是由GCC生成的可执行文件加载得很顺利,而我的程序肯定没有。
以下是两个可执行文件的readelf -a
的打印输出,以供参考。首先是我的编译器生成的程序:
$ readelf -a my-program
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400040
Start of program headers: 2400 (bytes into file)
Start of section headers: 2568 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 3
Size of section headers: 64 (bytes)
Number of section headers: 6
Section header string table index: 1
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] STRTAB 0000000000000000 00000b88
000000000000009e 0000000000000000 0 0 0
[ 2] .init PROGBITS 0000000000400040 00000040
000000000000006b 0000000000000000 AX 0 0 0
[ 3] .text PROGBITS 00000000004000ab 000000ab
0000000000000824 0000000000000000 AX 0 0 0
[ 4] .data PROGBITS 00000000008008cf 000008cf
0000000000000091 0000000000000000 WA 0 0 0
[ 5] .symtab SYMTAB 0000000000000000 00000c26
00000000000000c0 0000000000000018 1 8 0
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000040 0x0000000000400040 0x0000000000000000
0x000000000000006b 0x000000000000006b R E 0x1000
LOAD 0x00000000000000ab 0x00000000004000ab 0x0000000000000000
0x0000000000000824 0x0000000000000824 R E 0x1000
LOAD 0x00000000000008cf 0x00000000008008cf 0x0000000000000000
0x0000000000000091 0x0000000000000091 RW 0x1000
Section to Segment mapping:
Segment Sections...
00 .init
01 .text
02 .data
There is no dynamic section in this file.
There are no relocations in this file.
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000004004e0 0 NOTYPE LOCAL HIDDEN 3 .text.argument
1: 00000000004005a0 0 NOTYPE LOCAL HIDDEN 3 .text.constant
2: 0000000000400270 0 NOTYPE LOCAL HIDDEN 3 .text.memextend-page
3: 0000000000400210 0 NOTYPE LOCAL HIDDEN 3 .text.memextend-pool-32
4: 00000000004002b0 0 NOTYPE LOCAL HIDDEN 3 .text.unit
5: 00000000004005f0 0 NOTYPE LOCAL HIDDEN 3 .text.write
6: 00000000008008d0 0 NOTYPE LOCAL HIDDEN 4 .data.brkaddr
7: 0000000000400040 0 NOTYPE LOCAL HIDDEN 2 .init.brkaddr-init
No version information found in this file.
对于GCC生成的程序:
$ readelf -a gcc-program
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400110
Start of program headers: 64 (bytes into file)
Start of section headers: 336 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 3
Size of section headers: 64 (bytes)
Number of section headers: 5
Section header string table index: 4
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .note.gnu.build-i NOTE 00000000004000e8 000000e8
0000000000000024 0000000000000000 A 0 0 4
[ 2] .text PROGBITS 0000000000400110 00000110
0000000000000010 0000000000000000 AX 0 0 16
[ 3] .data PROGBITS 0000000000600120 00000120
0000000000000001 0000000000000000 WA 0 0 4
[ 4] .shstrtab STRTAB 0000000000000000 00000121
000000000000002a 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000000120 0x0000000000000120 R E 0x200000
LOAD 0x0000000000000120 0x0000000000600120 0x0000000000600120
0x0000000000000001 0x0000000000000001 RW 0x200000
NOTE 0x00000000000000e8 0x00000000004000e8 0x00000000004000e8
0x0000000000000024 0x0000000000000024 R 0x4
Section to Segment mapping:
Segment Sections...
00 .note.gnu.build-id .text
01 .data
02 .note.gnu.build-id
There is no dynamic section in this file.
There are no relocations in this file.
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
No version information found in this file.
Displaying notes found in: .note.gnu.build-id
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 1a3e678b08996ee6a9d289c3f76c7c52cd4a30aa
如您所见,我尝试镜像GCC的段位置(代码为〜0x400000,数据为〜0x800000),并且两个ELF标头严格相同。我能想到的唯一有意义的区别是,我的自定义二进制文件具有两个共享同一页面的LOAD段(一个用于初始化代码,一个用于其余代码),而GCC仅生成单个代码LOAD段。不过,这应该不会造成问题,因为它们都共享相同的权限并且不会重叠。
除此之外,我看不出有什么可能阻止第一个程序正确加载。如果任何精通Linux ELF加载程序的奥秘的人能启发我,那将不胜感激。
感谢您的关注,
答案 0 :(得分:1)
不要打扰每个人,一直是导致问题的页面共享部分。
认为问题可能出在内核加载程序中,我应该早些考虑运行dmesg,在那里我会注意到以下消息,以天为准:
[54178.211348] 12766 (my-program): Uhuuh, elf segment at 0000000000400000 requested but the memory is mapped already
很显然,some benevolent mastermind于3个月前决定,实际上捕获双重映射错误而不是像我们在ELF加载程序中那样让它们静默运行是一个好习惯。
这不是我的二进制文件过去是正确的,而是它们引起的错误之前没有被发现。我不知道我是否应该为我的错误一直困扰着我而感到骄傲或羞愧。
无论如何,我留下这个答案来警告任何愚蠢的人,以ELF二进制文件在单个页面上映射多个段:不要。没有尝试。
PS:@rodrigo:感谢您的回答,在您指出它们之前,我什至没有注意到PhysAddr。该手册说,它们是在“与物理寻址相关的系统上使用的”,这里似乎不是这种情况,但我会记得下次要注意它们。