I am new to NumPy and tried some code from a textbook. Unfortunately, beyond a certain input size the NumPy results go wrong. Here is the code:
import sys
from datetime import datetime
import numpy

def pythonsum(n):
    a = range(n)
    b = range(n)
    c = []
    for i in range(len(a)):
        a[i] = i**2
        b[i] = i**3
        c.append(a[i]+b[i])
    return c

def numpysum(n):
    a = numpy.arange(n) ** 2
    b = numpy.arange(n) ** 3
    c = a + b
    return c

size = int(sys.argv[1])
start = datetime.now()
c=pythonsum(size)
delta = datetime.now()-start
print "The last 2 elements of the sum",c[-2:]
print "PythonSum elapsed time in microseconds", delta.microseconds
start = datetime.now()
c=numpysum(size)
delta = datetime.now()-start
print "The last 2 elements of the sum",c[-2:]
print "NumPySum elapsed time in microseconds", delta.microseconds
When size >= 1291, the NumPy results become negative. I am using Python 2.6, Mac OS X 10.6 and NumPy 1.5.0. Any ideas?
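Answer 0 (score: 1)
Starting out with NumPy 1.5?
The introductory example in "Time for action - adding vectors" only works on 64-bit platforms, where long integers are available. Otherwise it returns wrong results: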
The last 2 elements of the sum [-2143491644 -2143487647]
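The threshold of 1291 reported in the question matches the limit of a signed 32-bit integer: for i = 1290, i**2 + i**3 is 2148353100, which is just above 2147483647. The sketch below reproduces this by forcing dtype=numpy.int32 explicitly, on the assumption that the asker's NumPy build defaulted to 32-bit integers. It is written for Python 3 and a recent NumPy, and the name numpysum_int32 is made up for this illustration.

```python
import numpy

# Largest value a signed 32-bit integer can hold.
INT32_MAX = numpy.iinfo(numpy.int32).max

def numpysum_int32(n):
    # Hypothetical variant of numpysum that forces 32-bit integers,
    # mimicking a platform whose default NumPy integer is int32.
    a = numpy.arange(n, dtype=numpy.int32) ** 2
    b = numpy.arange(n, dtype=numpy.int32) ** 3
    return a + b  # wraps around silently on overflow

# Exact value of the last element for size 1291, using Python ints.
exact = 1290 ** 2 + 1290 ** 3
print(exact, exact > INT32_MAX)

print(numpysum_int32(1290)[-1])  # last index is 1289, still fits
print(numpysum_int32(1291)[-1])  # last index is 1290, wraps to a negative value
```

NumPy does not raise an error for this kind of overflow in array arithmetic, which is why the script keeps running and simply prints negative numbers.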
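To fix this, turn the integer exponents in the power operations into floats, so that floating-point values are propagated through the calculation. The result is a 10x speedup: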
$ python vectorsum.py 1000000
The last 2 elements of the sum [9.99995000008e+17, 9.99998000001e+17]
PythonSum elapsed time in microseconds 3 59013
The last 2 elements of the sum [ 9.99993999e+17 9.99996999e+17]
NumPySum elapsed time in microseconds 0 308598
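One caveat with the float approach: a float64 only represents integers exactly up to 2**53, so sums of this size are approximations rather than exact integers. The snippet below is a minimal sketch of that trade-off, written for Python 3 and a recent NumPy; the variable names are made up for this illustration.

```python
import numpy

i = 999999

# Exact result using Python's arbitrary-precision integers.
exact = i ** 2 + i ** 3

# Same calculation with float exponents, as in the float-based fix.
arr = numpy.array([i])
approx = (arr ** 2. + arr ** 3.)[0]

print(exact)
print(int(approx))
print(int(approx) == exact)  # False: the float result is rounded

# Above 2**53 neighbouring integers collapse onto the same float.
print(float(2 ** 53 + 1) == float(2 ** 53))
```

Whether the rounding matters depends on the use case: for a timing demo it is harmless, but it is not a drop-in replacement when exact integer results are needed.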
The corrected example:
import sys
from datetime import datetime
import numpy

def numpysum(n):
    # Float exponents make NumPy compute in floating point
    # instead of a fixed-width integer type.
    a = numpy.arange(n) ** 2.
    b = numpy.arange(n) ** 3.
    c = a + b
    return c

def pythonsum(n):
    a = range(n)
    b = range(n)
    c = []
    for i in range(len(a)):
        a[i] = i ** 2.  # notice the dot (!)
        b[i] = i ** 3.
        c.append(a[i] + b[i])
    return c

size = int(sys.argv[1])
start = datetime.now()
c = pythonsum(size)
delta = datetime.now() - start
print "The last 2 elements of the sum", c[-2:]
print "PythonSum elapsed time in microseconds", delta.seconds, delta.microseconds
start = datetime.now()
c = numpysum(size)
delta = datetime.now() - start
print "The last 2 elements of the sum", c[-2:]
print "NumPySum elapsed time in microseconds", delta.seconds, delta.microseconds
The code is available on pastebin: http://paste.ubuntu.com/1169976/
Answer 1 (score: 0)
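I think there is some confusion in this thread. The reason the pure-Python (i.e. non-numpy) code works has nothing to do with 32-bit versus 64-bit. It works because Python ints can be arbitrarily large. [There are some implementation details in the background about whether the value is called an int or a long, but you don't have to worry about that; the conversion is seamless. That is why you sometimes see an L at the end of a number.]
For example: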
>>> 2**100
1267650600228229401496703205376L
numpy integer dtypes, on the other hand, are fixed precision, and will always fail for sufficiently large numbers, no matter how wide they are:
>>> for kind in numpy.int8, numpy.int16, numpy.int32, numpy.int64:
...     for power in 1, 2, 5, 20:
...         print kind, power, kind(10), kind(10)**power
...
<type 'numpy.int8'> 1 10 10
<type 'numpy.int8'> 2 10 100
<type 'numpy.int8'> 5 10 100000
<type 'numpy.int8'> 20 10 -2147483648
<type 'numpy.int16'> 1 10 10
<type 'numpy.int16'> 2 10 100
<type 'numpy.int16'> 5 10 100000
<type 'numpy.int16'> 20 10 -2147483648
<type 'numpy.int32'> 1 10 10
<type 'numpy.int32'> 2 10 100
<type 'numpy.int32'> 5 10 100000
<type 'numpy.int32'> 20 10 1661992960
<type 'numpy.int64'> 1 10 10
<type 'numpy.int64'> 2 10 100
<type 'numpy.int64'> 5 10 100000
<type 'numpy.int64'> 20 10 7766279631452241920
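The exact limits of each fixed-width type can be looked up with numpy.iinfo instead of being found by trial and error. The sketch below is written for Python 3 and a recent NumPy; the helper fits is a made-up name for this illustration.

```python
import numpy

# Print the representable range of each signed integer dtype.
for kind in (numpy.int8, numpy.int16, numpy.int32, numpy.int64):
    info = numpy.iinfo(kind)
    print(kind.__name__, info.min, info.max)

def fits(value, kind):
    # Hypothetical helper: True if the Python int `value`
    # can be stored in the NumPy integer type `kind`.
    info = numpy.iinfo(kind)
    return info.min <= value <= info.max

print(fits(10 ** 5, numpy.int32))
print(fits(10 ** 20, numpy.int64))
```

A check like this, done with Python ints before the array is built, is one way to decide up front whether a given dtype is wide enough.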
You can get the same results from numpy as from pure Python by telling it to use Python types, i.e. dtype=object, although performance takes a major hit:
>>> import numpy
>>> numpy.array([10])
array([10])
>>> numpy.array([10])**100
__main__:1: RuntimeWarning: invalid value encountered in power
array([-2147483648])
>>> numpy.array([10], dtype=object)
array([10], dtype=object)
>>> numpy.array([10], dtype=object)**100
array([ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], dtype=object)
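Applied to the function from the question, the dtype=object idea might look like the sketch below. It is written for Python 3 and a recent NumPy, and the names numpysum_object and numpysum_int64 are made up for this illustration. The int64 variant is included as a faster alternative, under the assumption that the sizes of interest stay small enough for i**2 + i**3 to remain below 2**63.

```python
import numpy

def numpysum_object(n):
    # Store Python ints in the arrays, so the arithmetic is exact
    # for any n, at the cost of NumPy's speed advantage.
    a = numpy.arange(n).astype(object) ** 2
    b = numpy.arange(n).astype(object) ** 3
    return a + b

def numpysum_int64(n):
    # Fixed-width 64-bit integers: fast, but still overflows
    # once i**2 + i**3 no longer fits in 63 bits.
    a = numpy.arange(n, dtype=numpy.int64) ** 2
    b = numpy.arange(n, dtype=numpy.int64) ** 3
    return a + b

c_obj = numpysum_object(1291)
c_i64 = numpysum_int64(1291)
print("The last 2 elements of the sum", c_obj[-2:])
print("The last 2 elements of the sum", c_i64[-2:])
```

For the size that triggered the problem in the question, both variants agree with the pure-Python result; only the object version stays exact for arbitrarily large inputs.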