Question

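I would like to collect here what happens when you run an executable on Windows, Linux and OSX. In particular, I would like to understand exactly the order of the operations. My guess is that the executable file format (PE, ELF or Mach-O) is loaded by the kernel (but I don't know the various sections of the ELF (Executable and Linkable Format) and their meaning), then the dynamic linker resolves the references, then the __init part of the executable is run, then main, then __fini, and then the program is finished. I am sure this is very rough, and probably wrong.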

Edit: the question is now CW. I am filling in the Linux part. If anyone wants to do the same for Win and OSX, that would be great.

Answer 1

This is only at a very high level of abstraction, of course!

Executable - No Shared Library:

Client request to run application
  ->Shell informs kernel to run binary
  ->Kernel allocates memory from the pool to fit the binary image into
  ->Kernel loads binary into memory
  ->Kernel jumps to specific memory address
  ->Kernel starts processing the machine code located at this location
  ->If machine code has a stop
  ->Kernel releases memory back to pool

Executable - Shared Library

Client request to run application
  ->Shell informs kernel to run binary
  ->Kernel allocates memory from the pool to fit the binary image into
  ->Kernel loads binary into memory
  ->Kernel jumps to specific memory address
  ->Kernel starts processing the machine code located at this location
  ->Kernel pushes current location into an execution stack
  ->Kernel jumps out of current memory to a shared memory location
  ->Kernel executes code from this shared memory location
  ->Kernel pops back the last memory location and jumps to that address
  ->If machine code has a stop
  ->Kernel releases memory back to pool

JavaScript/.NET/Perl/Python/PHP/Ruby (Interpreted Languages)

Client request to run application
  ->Shell informs kernel to run binary
  ->Kernel has a hook that recognises that the binary image needs a JIT
  ->Kernel calls JIT
  ->JIT loads the code and jumps to a specific address
  ->JIT reads the code and compiles the instruction into the 
    machine code that the interpreter is running on
  ->Interpreter passes machine code to the kernel
  ->kernel executes the required instruction
  ->JIT then increments the program counter
  ->If code has a stop
  ->JIT releases application from its memory pool
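
The "read an instruction, run it, increment the program counter, stop on a stop instruction" loop above can be sketched in a few lines of Python. This is only an illustration: the three-instruction set (PUSH, ADD, STOP) and the function name run are made up for the example, and it interprets instructions directly rather than JIT-compiling them.

```python
def run(program):
    """Fetch, execute and advance until a STOP instruction is reached."""
    stack = []
    pc = 0  # program counter: index of the instruction being executed
    while True:
        op, *args = program[pc]
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "STOP":
            return stack  # the "code has a stop" case: leave the loop
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1  # move on to the next instruction


# Hypothetical program: push 2, push 3, add them, stop.
result = run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("STOP",)])
```

A real runtime has far more instructions and may compile hot code to machine code, but the fetch/execute/advance shape is the same idea.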

As routeNpingme says, registers get set inside the CPU and the magic happens!

Update: Yes, I can't spell properly today!

Answer 2

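OK, answering my own question. This will be done progressively, and only for Linux (and maybe Mach-O). Feel free to add more material to your own answers so that they get upvoted (and you can get badges, since it is now CW).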

I'll start halfway, and build the rest as I find things out. This document was made with x86_64, gcc (GCC) 4.1.2.

Opening the file, initialization

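In this section we describe what happens when the program is invoked, from the kernel's point of view, up to the point where the program is ready to be executed.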

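The ELF is opened.
The kernel looks for the .text section and loads it into memory. It marks it as read-only.
The kernel loads the .data section.
The kernel loads the .bss section and initializes all of its content to zero.
The kernel transfers control to the dynamic linker (whose name is inside the ELF file, in the .interp section). The dynamic linker resolves all the shared library calls.
Control is transferred to the application.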
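
To make the steps above a bit more concrete, here is a rough Python sketch that reads the same information the loader looks at: the entry point from the ELF header, the interpreter path stored in .interp, and the loadable parts of the file. As far as I understand, the loader finds these through the program header table (PT_INTERP and PT_LOAD entries, which group sections such as .text, .data and .bss into segments) rather than by section name. The function name describe_elf is my own, and the sketch assumes a little-endian ELF64 file.

```python
import struct

# Layouts of the ELF64 file header and program header (little-endian).
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")
PT_LOAD, PT_INTERP = 1, 3


def describe_elf(data):
    """Return the entry point, interpreter path and loadable segments."""
    fields = ELF_HEADER.unpack_from(data, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    e_entry, e_phoff = fields[4], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]

    interp = None
    segments = []
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, _p_align) = PROGRAM_HEADER.unpack_from(
            data, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            # The dynamic linker's path, stored as a NUL-terminated string.
            raw = data[p_offset:p_offset + p_filesz]
            interp = raw.rstrip(b"\0").decode()
        elif p_type == PT_LOAD:
            # A region the kernel maps into memory.
            segments.append((p_vaddr, p_filesz, p_memsz, p_flags))
    return {"entry": e_entry, "interp": interp, "segments": segments}
```

Calling describe_elf(open("/bin/ls", "rb").read()) on an x86_64 Linux machine should report the dynamic linker path and the entry address. A segment whose memory size is larger than its file size is where the zero-filled .bss comes from.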

Execution of the program

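The function _start is called, as the ELF header specifies it as the entry point of the executable.
_start calls __libc_start_main in glibc (through the PLT), passing the following information to it:
1. the address of the actual main function
2. the address of argc
3. the address of argv
4. the address of the _init routine
5. the address of the _fini routine
6. a function pointer for the atexit() registration
7. the highest stack address available
_init gets called:
1. It calls call_gmon_start to initialize gmon profiling. Not really related to execution.
2. It calls frame_dummy, which wraps __register_frame_info(eh_frame section address, bss section address) (FIXME: what does this function do? Apparently it initializes global variables from the BSS section)
3. It calls __do_global_ctors_aux, whose role is to call all the global constructors listed in the .ctors section.
main is called.
main ends.
_fini is called, which in turn calls __do_global_dtors_aux to run all the destructors specified in the .dtors section.
The program exits.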
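
The ordering described above (constructors, then main, then exit handlers and destructors) can be mimicked with a toy Python model. This is not glibc's code: run_program and the callbacks are invented for illustration, and the sketch assumes that _fini is queued as an exit handler before main runs, so that handlers registered by main via atexit() run before it.

```python
def run_program(main, argv, init, fini):
    """Toy model of the startup sequence; not glibc's actual code."""
    trace = []          # records the order in which things run
    exit_handlers = []  # stands in for the atexit() list

    def atexit(handler):
        exit_handlers.append(handler)

    atexit(lambda: fini(trace))  # assumption: _fini is queued before main
    init(trace)                  # _init: global constructors
    status = main(len(argv), argv, trace, atexit)
    while exit_handlers:         # at exit, the newest handler runs first
        exit_handlers.pop()()
    return status, trace


def demo():
    def init(trace):
        trace.append("_init")

    def fini(trace):
        trace.append("_fini")

    def main(argc, argv, trace, atexit):
        atexit(lambda: trace.append("atexit handler"))
        trace.append("main")
        return 0

    return run_program(main, ["prog"], init, fini)
```

Running demo() records _init first, then main, then the handler registered inside main, and _fini last.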

Answer 3

On Windows, the image is first loaded into memory. The kernel analyzes which libraries (read "DLLs") it is going to need and loads them as well.

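It then edits the program image to insert the memory address of each library function it requires. These addresses already have space reserved in the .EXE binary, but the slots are just filled with zeros.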
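
A minimal sketch of that patching step, assuming the import table is a set of (DLL, function) slots that start out as zero and the loader knows where each loaded DLL exports its functions. The names bind_imports, core.dll and the addresses are hypothetical, chosen only for the example.

```python
def bind_imports(import_slots, exports):
    """Fill every empty import slot with the address exported by a loaded DLL."""
    bound = {}
    for dll, func in import_slots:
        try:
            bound[(dll, func)] = exports[dll][func]
        except KeyError:
            raise LookupError(f"{func} not found in {dll}") from None
    return bound


# Hypothetical data: slots are zero in the file, addresses are made up.
slots = {("core.dll", "open_file"): 0, ("core.dll", "close_file"): 0}
loaded = {"core.dll": {"open_file": 0x7FF01000, "close_file": 0x7FF01040}}
patched = bind_imports(slots, loaded)
```

After this step every call through a slot lands on the real function; a missing export is the kind of failure that stops a program before it even starts.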

Each DLL's DllMain() procedure is then executed, one by one, from the most depended-upon DLL to the last, following the order of dependencies.

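Once all the libraries are loaded and ready, the image is finally started, and whatever happens from then on depends on the language used, the compiler used, and the program's own routines.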
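
"Following the order of dependencies" is essentially a topological sort. Here is a small sketch using Python's standard graphlib module (Python 3.9 or later). The module names and the dependency graph are hypothetical, and the real Windows loader's rules are more involved than this.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def init_order(dependencies):
    """Order modules so that each one comes after everything it imports."""
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical graph: each module maps to the set of modules it depends on.
deps = {
    "app.exe": {"gui.dll", "core.dll"},
    "gui.dll": {"core.dll"},
    "core.dll": set(),
}
order = init_order(deps)
```

With this graph, core.dll is initialized first, then gui.dll, and the executable itself comes last.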

Answer 4

Once the image is loaded into memory, magic takes over.

Answer 5

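Well, depending on your exact definition, you have to account for the JIT compilers of platforms like .Net and Java. When you run a .Net "exe", which is not technically "executable", the JIT compiler steps in and compiles it.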

What happens when you run a program?
