I will include only part of the generated code. Assembly:
mov r15, 10000000 ; 10 million
now ; get time
lbl:
dec r15
include "temp_code.asm"
cmp r15, 0
jne lbl
now
section '.data' data readable writeable
include "temp_data.asm"
temp_code.asm contains:
mov rbx, 0
mov rax, [numbers0 + 0 * 8]
mov rcx, [numbers0 + 1 * 8]
imul rax, rcx
add rbx, rax
mov rax, [numbers0 + 2 * 8]
mov rcx, [numbers0 + 3 * 8]
imul rax, rcx
add rbx, rax
...
mov rax, [numbers99 + 18 * 8]
mov rcx, [numbers99 + 19 * 8]
imul rax, rcx
add rbx, rax
mov rax, rbx
That is 4200 lines in total, corresponding to 100 lines of Python.
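The 4200-line figure is consistent with each Python line expanding to 42 instructions: one `mov rbx, 0`, ten four-instruction multiply-accumulate groups, and a final `mov rax, rbx`. A minimal sketch of a generator under that assumption follows; the names `emit_line` and `emit_all` are hypothetical, and the real generator is not shown in the post.

```python
def emit_line(n, count=20):
    """Emit asm for one Python line: sum of products of adjacent pairs in numbers<n>."""
    out = ["mov rbx, 0"]  # assumed: the accumulator is reset for every Python line
    for i in range(0, count, 2):
        out += [
            f"mov rax, [numbers{n} + {i} * 8]",
            f"mov rcx, [numbers{n} + {i + 1} * 8]",
            "imul rax, rcx",
            "add rbx, rax",
        ]
    out.append("mov rax, rbx")  # assumed: the result is moved out after every line
    return out


def emit_all(rows=100):
    """Concatenate the asm for all rows (numbers0 .. numbers<rows-1>)."""
    lines = []
    for n in range(rows):
        lines.extend(emit_line(n))
    return lines


if __name__ == "__main__":
    print(len(emit_all()))  # 100 rows * 42 instructions = 4200
```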
temp_data.asm contains:
numbers0 dq 103,253,479,962,468,91,543,382,761,923,292,696,255,35,726,141,282,260,727,110
...
numbers99 dq 445,543,544,833,136,474,12,337,652,34,68,916,184,839,263,373,590,342,214,984
These are random numbers between 0 and 1000.
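For reference, a data file in this shape could be produced by a short script. This is only a sketch, assuming 100 rows of 20 integers drawn uniformly from 0 to 1000 inclusive; `make_data` and the seed are hypothetical, and the post does not say how the original file was generated.

```python
import random


def make_data(rows=100, per_row=20, seed=0):
    """Return lines of the form 'numbersN dq v0,v1,...' with random values."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    lines = []
    for n in range(rows):
        values = ",".join(str(rng.randint(0, 1000)) for _ in range(per_row))
        lines.append(f"numbers{n} dq {values}")
    return lines


if __name__ == "__main__":
    print("\n".join(make_data()))
```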
The corresponding PyPy code:
def f():
    temp0 = now()
    # m is a list containing 2000 random numbers
    for i in range(100000):  # 100 thousand
        m[0]*m[1]+m[2]*m[3]+m[4]*m[5]+m[6]*m[7]+m[8]*m[9]+m[10]*m[11]+m[12]*m[13]+m[14]*m[15]+m[16]*m[17]+m[18]*m[19]
        ...
        m[1980]*m[1981]+m[1982]*m[1983]+m[1984]*m[1985]+m[1986]*m[1987]+m[1988]*m[1989]+m[1990]*m[1991]+m[1992]*m[1993]+m[1994]*m[1995]+m[1996]*m[1997]+m[1998]*m[1999]
    temp1 = now()
    print(temp1 - temp0)
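The asm code runs in 6.755 seconds and PyPy in 30.109 seconds, so PyPy is 446 times slower (yes, 446: the Python loop runs 100,000 iterations, while the asm loop runs 10,000,000). I can't believe my eyes. What is going on here?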
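The factor of 446 comes from normalizing both timings to time per iteration, since the two loops run different iteration counts. A quick check of that arithmetic, using only the numbers quoted above:

```python
# Timings and iteration counts as reported in the post.
asm_seconds, asm_iters = 6.755, 10_000_000
pypy_seconds, pypy_iters = 30.109, 100_000

asm_per_iter = asm_seconds / asm_iters      # seconds per loop iteration in asm
pypy_per_iter = pypy_seconds / pypy_iters   # seconds per loop iteration in PyPy

slowdown = pypy_per_iter / asm_per_iter
print(round(slowdown))  # 446
```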
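Edit: a naive interpreter I wrote for a dynamic language also runs 260 times slower than the Python it is written in. But CPython (which runs this benchmark in 8.978 seconds, 3.35 times faster than PyPy) is a semi-compiler: it compiles to bytecode.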
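The bytecode CPython compiles to can be inspected with the standard `dis` module. A small sketch, where `pair_sum` is a hypothetical stand-in for a fragment of one benchmark line; the exact opcodes printed vary between Python versions:

```python
import dis


def pair_sum(m):
    # Two of the ten products from one benchmark line, kept short for readability.
    return m[0] * m[1] + m[2] * m[3]


if __name__ == "__main__":
    dis.dis(pair_sum)  # prints the bytecode instructions for pair_sum
```

Each subscript, multiply, and add shows up as a separate bytecode instruction that the interpreter loop dispatches at run time.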