Question

I am trying to understand the basics of assembly and C programming.

I compiled the following simple C program:

#include <stdio.h>

int main()
{
  int a;
  int b;
  a = 10;
  b = 88;

  return 0;
}

using the following command:

gcc -ggdb -fno-stack-protector test.c -o test

The disassembly of this program, built with gcc version 4.4.7, is:

55                      push   %ebp
89 e5                   mov    %esp,%ebp
83 ec 10                sub    $0x10,%esp
c7 45 f8 0a 00 00 00    movl   $0xa,-0x8(%ebp)
c7 45 fc 58 00 00 00    movl   $0x58,-0x4(%ebp)
b8 00 00 00 00          mov    $0x0,%eax
c9                      leave
c3                      ret
90                      nop

However, the disassembly of the same program built with gcc version 4.3.3 is:

8d 4c 24 04             lea    0x4(%esp),%ecx
83 e4 f0                and    $0xfffffff0,%esp
ff 71 fc                pushl  -0x4(%ecx)
55                      push   %ebp
89 e5                   mov    %esp,%ebp
51                      push   %ecx
83 ec 10                sub    $0x10,%esp
c7 45 f4 0a 00 00 00    movl   $0xa,-0xc(%ebp)
c7 45 f8 58 00 00 00    movl   $0x58,-0x8(%ebp)
b8 00 00 00 00          mov    $0x0,%eax
83 c4 10                add    $0x10,%esp
59                      pop    %ecx
5d                      pop    %ebp
8d 61 fc                lea    -0x4(%ecx),%esp
c3                      ret

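Why does the assembly code differ?
As you can see in the second listing, why is %ecx pushed onto the stack? And what is the significance of and $0xfffffff0, %esp?

Note: the operating system is the same.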

Answer 1

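A compiler is not required to generate the same assembly code for the same source code. As long as the observable behavior is the same, the C standard allows a compiler to optimize the code however it sees fit. So different compilers (or different versions of one compiler) may generate different assembly code.

For your code, {{3}} with -O3 generates just: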

xor     eax, eax
ret

because your code basically does nothing. So it is reduced to a simple return statement.
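As a rough illustration of that point, here is a minimal C sketch. The function names original_body and reduced_body are hypothetical and used only for this example. The first mirrors the body of the question's main(); the second is what an optimizer is allowed to turn it into, because the two stores are never read and so have no observable effect.

```c
/* Hypothetical name: mirrors the body of the question's main().
   Both locals are written but never read. */
int original_body(void)
{
    int a;
    int b;
    a = 10;
    b = 88;
    (void)a;  /* only silences unused-variable warnings */
    (void)b;
    return 0;
}

/* Hypothetical name: an equivalent function with the dead stores removed.
   A caller cannot tell the two apart, so a compiler may emit either. */
int reduced_body(void)
{
    return 0;
}
```

Both functions return 0 and do nothing else a caller could observe, which is why an unoptimized build may keep the stores while an optimized build drops them.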

Answer 2

To give you some idea of how many ways there are to produce valid code for a particular task, I thought this example might help.

From time to time there are size-coding competitions, obviously targeting assembly programmers, since at this level a compiler cannot compete with hand-written assembly at all.

The competition tasks are fairly trivial, to keep the entry level and total effort reasonable, with precise input and output specifications (down to single-byte or pixel perfection).

So you have an almost trivial, exactly specified task, human-produced code (at the moment still outperforming compilers on trivial tasks), and a single simple rule, "minimal size", as the goal.

By your logic, it is absolutely clear that every competitor should produce the same result.

The real-world answer to this is, for example:

Hugi Size Coding Competition Series - Compo29 - Random Maze Builder

12 entries, size of code (in bytes): 122, 122, 128, 135, 136, 137, 147, ... 278 (!).

And I bet the first two entries, both 122 bytes, are probably different enough (too lazy to actually check them).

Now, producing valid machine code from a high-level programming language, and by machine (a compiler), is a much more complex task. Compilers can't compete with humans in reasoning; most of the "how good is the code produced by a C++ compiler" stems from the C++ language itself being defined quite close to machine code (easy to compile), and from raw CPU power allowing compilers to work through thousands of variants of a particular code path, searching for a near-optimal solution mostly by brute force.

Still, the numerical "reasoning" behind optimizers is state of the art in its own right. Human-level results are still out of their reach, yet in their own way they excel, just as humans cannot match a compiler's efficiency within reasonable effort when compiling a full-sized application.

At this point, reasoning about some debug code differing in a few helper prologue/epilogue instructions... Even if you found a difference in optimized code, and the difference were "obvious" to a human, it is still quite a feat that the compiler can produce at least that, since the compiler has to apply universal rules to specific code without truly understanding the context of the task.
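As for the specific prologue instructions the question asks about: and $0xfffffff0, %esp clears the low four bits of the stack pointer, which rounds it down to a 16-byte boundary. Because that rounding discards information, the gcc 4.3.3 listing first saves the entry value plus 4 in %ecx (and pushes %ecx so it survives the function body), then restores %esp from it with lea -0x4(%ecx),%esp just before ret. The arithmetic can be sketched in C as follows; align_down_16 and restore_from_ecx are hypothetical names for this illustration, and the addresses are made-up 32-bit values.

```c
#include <stdint.h>

/* Hypothetical helper: the effect of `and $0xfffffff0, %esp`
   on a 32-bit stack pointer value. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & 0xfffffff0u;  /* clear the low 4 bits */
}

/* Hypothetical model of the save/restore pair in the 4.3.3 listing. */
uint32_t restore_from_ecx(uint32_t esp_at_entry)
{
    uint32_t ecx = esp_at_entry + 4;             /* lea 0x4(%esp),%ecx   */
    uint32_t esp = align_down_16(esp_at_entry);  /* and $0xfffffff0,%esp */
    (void)esp;       /* the function body would run on the aligned stack */
    return ecx - 4;                              /* lea -0x4(%ecx),%esp  */
}
```

With a made-up entry value of 0xbffff7ac, align_down_16 yields 0xbffff7a0, while restore_from_ecx returns 0xbffff7ac again, so ret finds the return address where the caller left it.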

Why is the assembly code different for a simple C program with different gcc versions?
