Question

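I have a question about an x86 assembly implementation of 64-bit multiplication. I've posted the code, commented as far as I was able to understand it. I'm at a loss as to what the rest does (and I may have made mistakes in the parts I've already annotated). Any direction would be appreciated.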

dest at %ebp+8
x    at %ebp+12
y    at %ebp+16

movl        16(%ebp), %esi      //Move y into %esi
movl        12(%ebp), %eax      //Move x into %eax
movl        %eax, %edx          //Move x into %edx
sarl        $31, %edx            //Shift x right 31 bits (only sign bit remains)
movl        20(%ebp), %ecx      //Move the low order bits of y into %ecx
imull       %eax, %ecx          //Multiply the contents of %ecx (low order bits of y) by x
movl        %edx, %ebx          //Copy sign bit of x to ebx
imull       %esi, %ebx          //Multiply sign bit of x in ebx by high order bits of y
addl        %ebx, %ecx          //Add the signed upper order bits of y to the lower order bits (What happens when this overflows?)
mull        %esi                //Multiply the contents of eax (x) by y
leal        (%ecx,%edx), %edx           
movl        8(%ebp), %ecx
movl        %eax, (%ecx)
movl        %edx, 4(%ecx)

Answer 1

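This isn't 64-bit multiplication (multiplying a pair of 64-bit numbers to get a 128-bit result). It is 32-bit multiplication (multiplying a pair of 32-bit numbers to get a 64-bit result).

32-bit 80x86 supports 32-bit multiplication with a single instruction. The MUL instruction multiplies a pair of unsigned 32-bit numbers to produce an unsigned 64-bit result in EDX:EAX, and the "one operand" version of the IMUL instruction multiplies a pair of signed 32-bit numbers to produce a signed 64-bit result in EDX:EAX.

Note: the "one operand" version of IMUL uses the value in EAX as the implied second operand.

So you need to load one of the values into EAX, use IMUL once (with the second value as its operand), then store the result.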
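As a rough illustration of the widening multiply described above, here is a minimal C sketch. The function name `mul32_to_64` is hypothetical and not part of the original post; on 32-bit x86 a compiler will typically (though not necessarily) turn the cast-then-multiply into a single one-operand IMUL.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): multiply two signed
   32-bit values and keep the full signed 64-bit product. Widening one
   operand first means the multiplication is carried out in 64 bits
   instead of being truncated to 32 bits. */
int64_t mul32_to_64(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}
```

For example, `mul32_to_64(INT32_MIN, -1)` yields 2147483648, a value that would not fit in a 32-bit result.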

Answer 2

Here is the algorithm for 64-bit multiplication:

x, y: 64-bit integer
x_h/x_l: higher/lower 32 bits of x
y_h/y_l: higher/lower 32 bits of y

x*y  = ((x_h*2^32 + x_l)*(y_h*2^32 + y_l)) mod 2^64
     = (x_h*y_h*2^64 + x_l*y_l + x_h*y_l*2^32 + x_l*y_h*2^32) mod 2^64
     = x_l*y_l + (x_h*y_l + x_l*y_h)*2^32

From the last line you can see that only 3 (not 4) multiplications are needed: the x_h*y_h term is scaled by 2^64 and vanishes mod 2^64.

 movl 16(%ebp), %esi    ; get y_l
 movl 12(%ebp), %eax    ; get x_l
 movl %eax, %edx
 sarl $31, %edx         ; get x_h: arithmetic shift x >> 31, the upper 32 bits of x sign-extended to 64 bits
 movl 20(%ebp), %ecx    ; get y_h
 imull %eax, %ecx       ; compute s: x_l*y_h
 movl %edx, %ebx
 imull %esi, %ebx       ; compute t: x_h*y_l
 addl %ebx, %ecx        ; compute s + t
 mull %esi              ; compute u: x_l*y_l
 leal (%ecx,%edx), %edx ; u_h += (s + t), result is u
 movl 8(%ebp), %ecx
 movl %eax, (%ecx)
 movl %edx, 4(%ecx)
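The listing above can be mirrored step by step in C. This is only a sketch under stated assumptions: the function name `mul_via_halves` is hypothetical, `x` is taken to be a 32-bit signed int and `y` a 64-bit signed integer (as the stack offsets suggest), and the final conversion back to `int64_t` relies on the usual two's-complement behaviour. The partial products `s` and `t` are kept to 32 bits on purpose. Any overflow in them, or in the final sum, wraps modulo 2^32, which is harmless because those bits would land at or above 2^64 in the full product.

```c
#include <stdint.h>

/* Hypothetical C mirror of the assembly listing (names are illustrative).
   Returns the low 64 bits of x * y using three 32-bit multiplications. */
int64_t mul_via_halves(int32_t x, int64_t y)
{
    uint32_t x_l = (uint32_t)x;                    /* movl 12(%ebp), %eax */
    uint32_t x_h = x < 0 ? 0xFFFFFFFFu : 0u;       /* sarl $31: sign bits */
    uint32_t y_l = (uint32_t)y;                    /* movl 16(%ebp), %esi */
    uint32_t y_h = (uint32_t)((uint64_t)y >> 32);  /* movl 20(%ebp), %ecx */

    uint32_t s = x_l * y_h;            /* imull: low 32 bits of x_l*y_h */
    uint32_t t = x_h * y_l;            /* imull: low 32 bits of x_h*y_l */
    uint64_t u = (uint64_t)x_l * y_l;  /* mull: full 64-bit x_l*y_l     */

    /* leal: add s + t into the high half of u, wrapping mod 2^32 */
    uint32_t hi = (uint32_t)(u >> 32) + s + t;
    uint32_t lo = (uint32_t)u;

    return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

Comparing the result with the compiler's own `(int64_t)x * y` for a handful of inputs, including negative ones, is a quick way to check the decomposition.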

See also: implement 64-bit arithmetic on a 32-bit machine

Assembly: 64-bit multiplication with 32-bit registers
