Question

Given the following code:

typedef struct tagRECT {
  int left;
  int top;
  int right;
  int bottom;
} RECT;

extern int Func(RECT *a, int b, char *c, int d, char e, long f, int g, int h, int i, int j);

int main() {

}

void gui() {
    RECT x = {4, 5, 6, 7};
    Func(&x, 1, 0, 3, 4, 5, 6, 7, 8, 9);
}

This is the assembly that gcc generates for x86_64, presumably on Linux (I used Compiler Explorer).

main:
  mov eax, 0
  ret
gui:
  push rbp
  mov rbp, rsp
  sub rsp, 16
  ; RECT x assignment
  mov DWORD PTR [rbp-16], 4
  mov DWORD PTR [rbp-12], 5
  mov DWORD PTR [rbp-8], 6
  mov DWORD PTR [rbp-4], 7

  ; parameters
  lea rax, [rbp-16]
  push 9
  push 8
  push 7
  push 6
  mov r9d, 5
  mov r8d, 4
  mov ecx, 3
  mov edx, 0
  mov esi, 1
  mov rdi, rax
  call Func
  add rsp, 32
  nop
  leave
  ret

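As you can see, the ints in the struct are aligned on 4 bytes. But the last 4 arguments of the function, all ints, are pushed onto the stack, which means they are aligned on 8 bytes. Why the inconsistency?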

Answer 1

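Stack slots are 8 bytes in x86-64 calling conventions such as the x86-64 System V calling convention you're using, because a 32-bit push/pop is not encodable in 64-bit mode, and because 8-byte slots make it easier to keep the stack 16-byte aligned. See "What are the calling conventions for UNIX & Linux system calls on i386 and x86-64" (it also covers function-calling conventions, not just system-call conventions) and "Where is the x86-64 System V ABI documented?".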
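To put numbers on the two layouts in the question, here is a minimal sketch. It is not part of the original answer; `STACK_SLOT`, `STACK_ARG_OFFSET` and `print_layouts` are hypothetical names introduced only for illustration, and the sketch assumes a 4-byte `int`, as on x86-64 Linux.

```c
#include <stddef.h> /* offsetof */
#include <stdio.h>

typedef struct tagRECT {
    int left;
    int top;
    int right;
    int bottom;
} RECT;

/* One stack-argument slot in the x86-64 System V convention. */
#define STACK_SLOT 8

/* Offset of the n-th stack-passed argument (0-based) from the first one,
   assuming each argument fits in one slot, as the four ints g..j do. */
#define STACK_ARG_OFFSET(n) ((n) * STACK_SLOT)

/* Struct members only need their natural alignment: 4 bytes for int. */
_Static_assert(sizeof(int) == 4, "sketch assumes a 4-byte int");
_Static_assert(offsetof(RECT, bottom) == 12, "members sit at 4-byte steps");
_Static_assert(sizeof(RECT) == 16, "x occupies [rbp-16, rbp)");

/* Hypothetical helper: prints both layouts side by side. */
void print_layouts(void) {
    printf("RECT members: left=%zu top=%zu right=%zu bottom=%zu (size %zu)\n",
           offsetof(RECT, left), offsetof(RECT, top),
           offsetof(RECT, right), offsetof(RECT, bottom), sizeof(RECT));
    for (int n = 0; n < 4; n++)
        printf("stack arg %d at offset %d\n", n, STACK_ARG_OFFSET(n));
    printf("stack bytes for 4 args: %d\n", 4 * STACK_SLOT);
}
```

Calling `print_layouts()` from `main` lists the member offsets next to the slot offsets. The four slots add up to 32 bytes, which is the amount the caller releases with `add rsp, 32` after the call.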

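mov works fine, so making 4 bytes the minimum unit for stack args would have been a valid design. (Unlike x86-16, where SP-relative addressing modes weren't possible.) But unless you introduce padding rules, you could end up with misaligned 8-byte args. So giving every arg at least 8-byte alignment was probably part of the motivation. (Although there are padding rules to ensure that __m128 args have 16-byte alignment, __m256 args have 32-byte alignment, and so on, and presumably the same applies to over-aligned structs such as struct { alignas(64) char b[256]; };.)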
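The misalignment concern can be checked with plain offset arithmetic. The sketch below is an illustration only: the "packed" scheme is the hypothetical 4-byte-slot design discussed above, not anything a real ABI does, and `ALIGN_UP` and the enum names are made up for this example. It assumes the x86-64 Linux sizes (4-byte `int`, 8-byte-aligned `long`).

```c
#include <stdalign.h> /* alignas, alignof */

/* Round off up to the next multiple of a (a > 0). */
#define ALIGN_UP(off, a) ((((off) + (a) - 1) / (a)) * (a))

enum {
    INT_SIZE = 4,   /* sizeof(int) on x86-64 */
    LONG_ALIGN = 8, /* alignment of long on x86-64 Linux */

    /* Hypothetical scheme: 4-byte slots and no padding rule.
       An int argument at offset 0 is followed directly by a long. */
    PACKED_LONG_OFF = 0 + INT_SIZE,

    /* Actual scheme: every argument starts on an 8-byte slot. */
    SLOTTED_LONG_OFF = ALIGN_UP(0 + INT_SIZE, 8)
};

_Static_assert(PACKED_LONG_OFF % LONG_ALIGN != 0,
               "4-byte slots would leave the long misaligned");
_Static_assert(SLOTTED_LONG_OFF % LONG_ALIGN == 0,
               "8-byte slots keep the long aligned");

/* An over-aligned struct like the one mentioned above: the language
   itself demands 64-byte alignment, so 8-byte slots alone are not enough. */
struct Overaligned {
    alignas(64) char b[256];
};

_Static_assert(alignof(struct Overaligned) == 64,
               "alignas raises the struct's alignment to 64");
```

With 8-byte slots the common case (`int`, `long`, pointers) needs no padding rule at all. Only types whose alignment exceeds 8 need special handling.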

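Having only 4-byte slots would break more easily for functions without a prototype, and might make variadic functions more complex. But x86-64 System V already passes large objects by value on the stack, so a single stack arg may take more than one 8-byte "stack slot".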
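As a rough illustration of one argument spanning several slots, the sketch below counts 8-byte slots for a struct that is too large to travel in registers. `SLOTS_FOR` and `struct Big` are hypothetical names for this example. It relies on the System V rule that an aggregate of plain integers larger than 16 bytes is passed in memory.

```c
/* One stack-argument slot in the x86-64 System V convention. */
#define STACK_SLOT 8

/* Number of 8-byte slots needed to hold an object of the given type. */
#define SLOTS_FOR(type) ((sizeof(type) + STACK_SLOT - 1) / STACK_SLOT)

/* 24 bytes of integers: larger than 16 bytes, so it is passed by value
   in memory (on the stack) rather than in registers. */
struct Big {
    long long a;
    long long b;
    long long c;
};

_Static_assert(sizeof(struct Big) == 24, "three 8-byte members");
_Static_assert(SLOTS_FOR(struct Big) == 3, "one argument, three slots");
_Static_assert(SLOTS_FOR(int) == 1, "an int still takes a whole slot");
```

So "one slot per argument" is not an invariant in System V. The slot size only sets the granularity and the minimum alignment.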

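(Unlike Windows x64, which passes large objects by hidden reference, so every arg is exactly one stack slot. It even reserves 32 bytes of shadow space so that a variadic function can spill its register args into the shadow space and create a full array of all the args.)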
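The "full array of all the args" point comes down to simple arithmetic, sketched below. The macro names are hypothetical. Offsets are relative to rsp at function entry, where `[rsp]` holds the return address.

```c
#define WIN64_SLOT 8
#define WIN64_REG_ARGS 4 /* args passed in rcx, rdx, r8, r9 */

/* Shadow space: one home slot per register argument. */
#define WIN64_SHADOW_BYTES (WIN64_REG_ARGS * WIN64_SLOT)

/* Offset from rsp at function entry to the home slot of argument n
   (0-based). The return address occupies the first 8 bytes. */
#define WIN64_ARG_HOME(n) (WIN64_SLOT + (n) * WIN64_SLOT)

_Static_assert(WIN64_SHADOW_BYTES == 32, "4 register args x 8 bytes");
_Static_assert(WIN64_ARG_HOME(0) == 8, "home of the first register arg");
_Static_assert(WIN64_ARG_HOME(4) == 40, "first stack arg follows the shadow space");
```

Once the callee stores its four register args into their home slots, argument n sits at a fixed stride of 8 bytes whether it arrived in a register or on the stack.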

Difference in data alignment between a struct and function arguments?
