Question

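I'm currently learning to read assembly by disassembling C programs and trying to understand what they do.

I'm stuck on a trivial one: a simple hello world program.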

#include <stdio.h>
#include <stdlib.h>

int main() {
  printf("Hello, world!");
  return(0);
}

When I disassemble main:

(gdb) disassemble main
Dump of assembler code for function main:
   0x0000000000400526 <+0>: push   rbp
   0x0000000000400527 <+1>: mov    rbp,rsp
   0x000000000040052a <+4>: mov    edi,0x4005c4
   0x000000000040052f <+9>: mov    eax,0x0
   0x0000000000400534 <+14>:    call   0x400400 <printf@plt>
   0x0000000000400539 <+19>:    mov    eax,0x0  
   0x000000000040053e <+24>:    pop    rbp
   0x000000000040053f <+25>:    ret

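I understand the first two lines: the base pointer is saved on the stack (by push rbp, which decreases the stack pointer by 8, because the stack has "grown"), and the value of the stack pointer is saved in the base pointer (so that parameters and local variables can be reached easily through positive and negative offsets, respectively, while the stack keeps "growing").

The third line raises the first question: why is 0x4005c4 (the address of the "Hello, world!" string) moved into the edi register instead of being pushed onto the stack? Shouldn't printf take the address of that string as a parameter? As far as I know, functions take their arguments from the stack (but here it looks like the argument is put in a register: edi).

In another post here on StackOverflow I read that "printf@plt" is like a stub function that calls the real printf function. I tried to disassemble that function, but it only gets more confusing: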

(gdb) disassemble printf
Dump of assembler code for function __printf:
   0x00007ffff7a637b0 <+0>: sub    rsp,0xd8
   0x00007ffff7a637b7 <+7>: test   al,al
   0x00007ffff7a637b9 <+9>: mov    QWORD PTR [rsp+0x28],rsi
   0x00007ffff7a637be <+14>:    mov    QWORD PTR [rsp+0x30],rdx
   0x00007ffff7a637c3 <+19>:    mov    QWORD PTR [rsp+0x38],rcx
   0x00007ffff7a637c8 <+24>:    mov    QWORD PTR [rsp+0x40],r8
   0x00007ffff7a637cd <+29>:    mov    QWORD PTR [rsp+0x48],r9
   0x00007ffff7a637d2 <+34>:    je     0x7ffff7a6380b <__printf+91>
   0x00007ffff7a637d4 <+36>:    movaps XMMWORD PTR [rsp+0x50],xmm0
   0x00007ffff7a637d9 <+41>:    movaps XMMWORD PTR [rsp+0x60],xmm1
   0x00007ffff7a637de <+46>:    movaps XMMWORD PTR [rsp+0x70],xmm2
   0x00007ffff7a637e3 <+51>:    movaps XMMWORD PTR [rsp+0x80],xmm3
   0x00007ffff7a637eb <+59>:    movaps XMMWORD PTR [rsp+0x90],xmm4
   0x00007ffff7a637f3 <+67>:    movaps XMMWORD PTR [rsp+0xa0],xmm5
   0x00007ffff7a637fb <+75>:    movaps XMMWORD PTR [rsp+0xb0],xmm6
   0x00007ffff7a63803 <+83>:    movaps XMMWORD PTR [rsp+0xc0],xmm7
   0x00007ffff7a6380b <+91>:    lea    rax,[rsp+0xe0]
   0x00007ffff7a63813 <+99>:    mov    rsi,rdi
   0x00007ffff7a63816 <+102>:   lea    rdx,[rsp+0x8]
   0x00007ffff7a6381b <+107>:   mov    QWORD PTR [rsp+0x10],rax
   0x00007ffff7a63820 <+112>:   lea    rax,[rsp+0x20]
   0x00007ffff7a63825 <+117>:   mov    DWORD PTR [rsp+0x8],0x8
   0x00007ffff7a6382d <+125>:   mov    DWORD PTR [rsp+0xc],0x30
   0x00007ffff7a63835 <+133>:   mov    QWORD PTR [rsp+0x18],rax
   0x00007ffff7a6383a <+138>:   mov    rax,QWORD PTR [rip+0x36d70f]        # 0x7ffff7dd0f50
   0x00007ffff7a63841 <+145>:   mov    rdi,QWORD PTR [rax]
   0x00007ffff7a63844 <+148>:   call   0x7ffff7a5b130 <_IO_vfprintf_internal>
   0x00007ffff7a63849 <+153>:   add    rsp,0xd8
   0x00007ffff7a63850 <+160>:   ret    
End of assembler dump.

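The two mov operations on eax (mov eax,0x0) also bother me a bit, since I don't see what role they play here (but I'm more concerned with what I just described). Thanks in advance.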

Answer 1

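gcc is targeting the x86-64 System V ABI, which is used by all x86-64 systems other than Windows (for various historical reasons). Its calling convention passes the first few args in registers before falling back to the stack. (See also Wikipedia's basic summary of this calling convention.)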
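As a rough illustration of "registers first, then the stack", here is a minimal sketch of the order in which the System V ABI assigns integer and pointer arguments. The helper name `int_arg_location` is made up for this example, and it deliberately ignores floating-point and struct arguments.

```c
#include <stddef.h>

/* Integer/pointer argument registers of the x86-64 System V ABI, in order. */
const char *const INT_ARG_REGS[] = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };

/* Hypothetical helper (not part of any library): where does the n-th
 * (0-based) integer or pointer argument of a call go?  Returns a register
 * name, or "stack" once the six registers are used up.  FP and struct
 * arguments are not modelled here. */
const char *int_arg_location(size_t index)
{
    size_t nregs = sizeof INT_ARG_REGS / sizeof INT_ARG_REGS[0];
    return index < nregs ? INT_ARG_REGS[index] : "stack";
}
```

For the hello world call, the format string is argument 0, so `int_arg_location(0)` gives "rdi", which matches the mov edi,0x4005c4 in the dump (writing edi zero-extends into rdi).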

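And yes, this is different from the crusty old 32-bit calling conventions that use the stack for everything. This is a Good Thing. See also the x86 tag wiki for more links to ABI docs, and lots of other good stuff.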

   0x0000000000400526: push   rbp
   0x0000000000400527: mov    rbp,rsp         # stack-frame boilerplate
   0x000000000040052a: mov    edi,0x4005c4    # first arg
   0x000000000040052f: mov    eax,0x0         # 0 FP args in vector registers
   0x0000000000400534: call   0x400400 <printf@plt>
   0x0000000000400539: mov    eax,0x0         # return 0.  If you'd compiled with optimization, this and the previous mov would be  xor eax,eax
   0x000000000040053e: pop    rbp             # clean up stack frame
   0x000000000040053f: ret

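Pointers to static data fit in 32 bits (in a position-dependent executable such as this one), which is why it can use mov edi, imm32 instead of movabs rdi, imm64.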

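Floating-point args are passed in SSE registers (xmm0-xmm7), even to var-args functions. al indicates how many FP args are in vector registers. (Note that C's type promotion rules mean that float args to variadic functions are always promoted to double, which is why printf doesn't have any format specifiers for float, only for double and long double.)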
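The promotion rule can be seen from the callee's side. The following is a small sketch, with a made-up function name `sum_fp_args`, of a variadic function that receives floating-point arguments; it has to read every one of them as double, whether the caller wrote a float or a double.

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` variadic floating-point arguments. */
double sum_fp_args(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* Always read as double: a float passed through "..." has already
         * been promoted by the caller.  va_arg(ap, float) would be
         * undefined behavior. */
        total += va_arg(ap, double);
    }
    va_end(ap);
    return total;
}
```

A call such as `sum_fp_args(2, 1.5f, 2.25)` passes both values as doubles, so on x86-64 System V the caller would be expected to set al to 2 before the call.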

"printf@plt" is like a stub function that calls the real printf function.

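Yes, that's right. The Procedure Linkage Table entry jumps indirectly through a pointer that initially leads to a dynamic linker routine; that routine resolves the symbol and updates the pointer, so later calls through the PLT entry jmp straight to the address where libc's printf definition is mapped. printf is a weak alias for __printf, which is why gdb chooses the __printf label for that address after you asked for the disassembly of printf.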
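The lazy-binding idea can be modelled in plain C with a function pointer. This is only an analogy under stated assumptions, not how the dynamic linker is implemented: `got_slot`, `resolver_stub`, `call_via_plt` and `real_function` are all invented names, and `real_function` merely stands in for libc's printf.

```c
typedef int (*fn_t)(int);

/* Stands in for the real library function (e.g. libc's printf). */
int real_function(int x) { return x + 7; }

/* Stands in for the dynamic linker's resolver routine. */
int resolver_stub(int x);

/* The slot the stub jumps through; it initially points at the resolver. */
fn_t got_slot = resolver_stub;

/* Counts how often the slow "resolve" path runs (for illustration only). */
int resolve_count = 0;

int resolver_stub(int x)
{
    resolve_count++;
    got_slot = real_function;   /* "resolve the symbol": patch the slot */
    return got_slot(x);         /* then carry on into the real function */
}

/* Stands in for the PLT stub: always an indirect jump through the slot. */
int call_via_plt(int x)
{
    return got_slot(x);
}
```

The first `call_via_plt` call goes through the resolver and patches the slot; every later call reaches `real_function` directly, so `resolve_count` stays at 1.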

Dump of assembler code for function __printf:
   0x00007ffff7a637b0 <+0>: sub    rsp,0xd8                     # reserve space
   0x00007ffff7a637b7 <+7>: test   al,al                        # check if there were any FP args
   0x00007ffff7a637b9 <+9>: mov    QWORD PTR [rsp+0x28],rsi     # store the integer arg-passing registers to local scratch space
   0x00007ffff7a637be <+14>:    mov    QWORD PTR [rsp+0x30],rdx
   0x00007ffff7a637c3 <+19>:    mov    QWORD PTR [rsp+0x38],rcx
   0x00007ffff7a637c8 <+24>:    mov    QWORD PTR [rsp+0x40],r8
   0x00007ffff7a637cd <+29>:    mov    QWORD PTR [rsp+0x48],r9
   0x00007ffff7a637d2 <+34>:    je     0x7ffff7a6380b <__printf+91>    # skip storing the FP arg-passing regs if there were no FP args
   0x00007ffff7a637d4 <+36>:    movaps XMMWORD PTR [rsp+0x50],xmm0
   0x00007ffff7a637d9 <+41>:    movaps XMMWORD PTR [rsp+0x60],xmm1
   0x00007ffff7a637de <+46>:    movaps XMMWORD PTR [rsp+0x70],xmm2
   0x00007ffff7a637e3 <+51>:    movaps XMMWORD PTR [rsp+0x80],xmm3
   0x00007ffff7a637eb <+59>:    movaps XMMWORD PTR [rsp+0x90],xmm4
   0x00007ffff7a637f3 <+67>:    movaps XMMWORD PTR [rsp+0xa0],xmm5
   0x00007ffff7a637fb <+75>:    movaps XMMWORD PTR [rsp+0xb0],xmm6
   0x00007ffff7a63803 <+83>:    movaps XMMWORD PTR [rsp+0xc0],xmm7
 branch_target_from_test_je:
   0x00007ffff7a6380b <+91>:    lea    rax,[rsp+0xe0]
   # some more stuff

So printf's implementation keeps the var-args handling simple by storing all the arg-passing registers (except the first, which holds the format string) to local arrays. It can then walk a pointer through them instead of needing switch-like code to extract the right integer or FP arg. It still needs to keep track of the first 5 integer and first 8 FP args, because they aren't contiguous with the rest of the args that the caller pushed onto the stack.
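At the C level, the dump has the shape of a thin variadic wrapper: capture the arguments in a va_list and forward them to the v* variant (here _IO_vfprintf_internal). The sketch below shows that pattern with standard library calls only; `format_into` is a made-up name, and it writes into a buffer via vsnprintf rather than to stdout so the result is easy to inspect.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper with the same shape as __printf: va_start is where
 * the compiler emits the register-saving code seen in the dump, and the
 * actual formatting is delegated to the va_list-taking function. */
int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);                     /* set up access to the saved args */
    written = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return written;
}
```

With `char buf[32];`, a call like `format_into(buf, sizeof buf, "%d-%s", 7, "x")` leaves "7-x" in buf and returns 3.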

The Windows 64-bit calling convention's shadow space simplifies this by providing space for a function to dump its register args to the stack contiguous with the args already on the stack, but IMO that's not worth wasting 32 bytes of stack on every call. (See my answer, and my comments on other answers, on "Why does Windows64 use a different calling convention from all other OSes on x86-64?")

Answer 2

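Nothing about printf is trivial, so it is not the first choice for what you are trying to do, but it turned out not to be overly complicated.

Something simpler: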

extern unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int x )
{
    return(more_fun(x)+7);
}

0000000000000000 <fun>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <fun+0x9>
   9:   48 83 c4 08             add    $0x8,%rsp
   d:   83 c0 07                add    $0x7,%eax
  10:   c3                      retq

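The stack pointer is adjusted, but only to keep the stack aligned across the call; x arrives in edi and is passed on unchanged. eax is used for the return value.

Now use a pointer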

extern unsigned int more_fun ( unsigned int * );
unsigned int fun ( unsigned int x )
{
    return(more_fun(&x)+7);
}

0000000000000000 <fun>:
   0:   48 83 ec 18             sub    $0x18,%rsp
   4:   89 7c 24 0c             mov    %edi,0xc(%rsp)
   8:   48 8d 7c 24 0c          lea    0xc(%rsp),%rdi
   d:   e8 00 00 00 00          callq  12 <fun+0x12>
  12:   48 83 c4 18             add    $0x18,%rsp
  16:   83 c0 07                add    $0x7,%eax
  19:   c3                      retq

And there you have edi (rdi) being used, as in your case.

Two pointers

extern unsigned int more_fun ( unsigned int *, unsigned int * );
unsigned int fun ( unsigned int x, unsigned int y )
{
    return(more_fun(&x,&y)+7);
}

0000000000000000 <fun>:
   0:   48 83 ec 18             sub    $0x18,%rsp
   4:   89 7c 24 0c             mov    %edi,0xc(%rsp)
   8:   89 74 24 08             mov    %esi,0x8(%rsp)
   c:   48 8d 7c 24 0c          lea    0xc(%rsp),%rdi
  11:   48 8d 74 24 08          lea    0x8(%rsp),%rsi
  16:   e8 00 00 00 00          callq  1b <fun+0x1b>
  1b:   48 83 c4 18             add    $0x18,%rsp
  1f:   83 c0 07                add    $0x7,%eax
  22:   c3                      retq

Now edi and esi are used (rdi and rsi for the outgoing pointers). It all looks like the calling convention to me...

A string

extern unsigned int more_fun ( const char * );
unsigned int fun ( void  )
{
    return(more_fun("Hello World")+7);
}

0000000000000000 <fun>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   bf 00 00 00 00          mov    $0x0,%edi
   9:   e8 00 00 00 00          callq  e <fun+0xe>
   e:   48 83 c4 08             add    $0x8,%rsp
  12:   83 c0 07                add    $0x7,%eax
  15:   c3                      retq

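eax is not prepared here the way it is for printf, so eax has something to do with printf's variable arguments (it is the number of FP args passed in vector registers); try passing more arguments to printf, floating-point ones in particular, and see whether eax changes.

If I add -m32 to the command line, edi is not used.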
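To make the role of al concrete, here is a small sketch of how the value could be worked out for a given variadic call on x86-64 System V. The helper `vector_regs_used` and its one-letter-per-argument encoding are invented for this example. The ABI only requires al to be an upper bound on the number of vector registers used; the sketch returns the exact count.

```c
/* Hypothetical helper.  `kinds` describes the arguments of one call, one
 * character each: 'i' for an integer or pointer, 'd' for a double.
 * Returns how many vector registers (xmm0-xmm7) carry arguments, i.e. a
 * value the caller could place in al before calling a variadic function. */
int vector_regs_used(const char *kinds)
{
    int used = 0;

    for (const char *p = kinds; *p != '\0'; p++) {
        /* Only the first eight FP args go in registers; the rest go on
         * the stack and do not count. */
        if (*p == 'd' && used < 8)
            used++;
    }
    return used;
}
```

For printf("Hello, world!") the argument list is just "i" (the format string pointer), so the result is 0, matching the mov eax,0x0 before the call. Adding integer arguments leaves it at 0; each double argument raises it by one, up to 8.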

00000000 <fun>:
   0:   83 ec 18                sub    $0x18,%esp
   3:   68 00 00 00 00          push   $0x0
   8:   e8 fc ff ff ff          call   9 <fun+0x9>
   d:   83 c4 1c                add    $0x1c,%esp
  10:   83 c0 07                add    $0x7,%eax
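  13:   c3                      ret

I suspect the push operand is a placeholder for the address of the string, which the linker fills in when it patches up the binary; this is just an object file. So my guess is that in 64-bit code the first few arguments go into registers, and the stack is used once the registers run out.

Obviously the compiler works, so this conforms to the compiler's calling convention.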

extern unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int x )
{
    return(more_fun(x+5)+7);
}

0000000000000000 <fun>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   83 c7 05                add    $0x5,%edi
   7:   e8 00 00 00 00          callq  c <fun+0xc>
   c:   48 83 c4 08             add    $0x8,%rsp
  10:   83 c0 07                add    $0x7,%eax
  13:   c3                      retq

Corrected per Peter's comment. Yes, it appears that registers are being used here.

And since he mentioned 6 parameters, let's try 7.

extern unsigned int more_fun
(
unsigned int,
unsigned int,
unsigned int,
unsigned int,
unsigned int,
unsigned int,
unsigned int
);
unsigned int fun (
unsigned int a,
unsigned int b,
unsigned int c,
unsigned int d,
unsigned int e,
unsigned int f,
unsigned int g
)
{
    return(more_fun(a+1,b+2,c+3,d+4,e+5,f+6,g+7)+17);
}

0000000000000000 <fun>:
   0:   48 83 ec 10             sub    $0x10,%rsp
   4:   83 c1 04                add    $0x4,%ecx
   7:   83 c2 03                add    $0x3,%edx
   a:   8b 44 24 18             mov    0x18(%rsp),%eax
   e:   83 c6 02                add    $0x2,%esi
  11:   83 c7 01                add    $0x1,%edi
  14:   41 83 c1 06             add    $0x6,%r9d
  18:   41 83 c0 05             add    $0x5,%r8d
  1c:   83 c0 07                add    $0x7,%eax
  1f:   50                      push   %rax
  20:   e8 00 00 00 00          callq  25 <fun+0x25>
  25:   48 83 c4 18             add    $0x18,%rsp
  29:   83 c0 11                add    $0x11,%eax
  2c:   c3                      retq

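And sure enough, the 7th parameter was pulled from the stack, modified, and put back on the stack before the call. The other 6 are in registers.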
