Question

I'm new to NumPy and tried out some textbook code. Unfortunately, beyond a certain problem size the NumPy results get messed up. Here is the code:

import sys
from datetime import datetime
import numpy

def pythonsum(n):
    a = range(n)
    b = range(n)
    c = []
    for i in range(len(a)):
        a[i] = i**2
        b[i] = i**3
        c.append(a[i]+b[i])
    return c

def numpysum(n):
    a = numpy.arange(n) ** 2
    b = numpy.arange(n) ** 3
    c = a + b
    return c

size = int(sys.argv[1])
start = datetime.now()
c=pythonsum(size)
delta = datetime.now()-start
print "The last 2 elements of the sum",c[-2:]
print "PythonSum elapsed time in microseconds", delta.microseconds
start = datetime.now()
c=numpysum(size)
delta = datetime.now()-start
print "The last 2 elements of the sum",c[-2:]
print "NumPySum elapsed time in microseconds", delta.microseconds

When size >= 1291, the results become negative. I'm using Python 2.6, Mac OS X 10.6, and NumPy 1.5.0. Any ideas?

Answer 1

Beginning NumPy 1.5?

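The introductory example in "Time for action - adding vectors" only runs correctly on 64-bit platforms that allow long integers. Otherwise it returns wrong results: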

The last 2 elements of the sum [-2143491644 -2143487647]

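To fix this, convert the integers in the power operation to floats so that floating-point values are carried through the calculation. The result is a 10x speedup: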

$ python vectorsum.py 1000000
The last 2 elements of the sum [9.99995000008e+17, 9.99998000001e+17]
PythonSum elapsed time in microseconds 3 59013
The last 2 elements of the sum [9.99993999e+17 9.99996999e+17]
NumPySum elapsed time in microseconds 0 308598

The corrected example:

import sys
from datetime import datetime
import numpy

def numpysum(n):
    a = numpy.arange(n) ** 2.
    b = numpy.arange(n) ** 3.
    c = a + b
    return c

def pythonsum(n):
    a = range(n)
    b = range(n)
    c = []
    for i in range(len(a)):
        a[i] = i ** 2.     # notice the dot (!)
        b[i] = i ** 3.
        c.append(a[i] + b[i])
    return c

size = int(sys.argv[1])
start = datetime.now()
c = pythonsum(size)
delta = datetime.now() - start
print "The last 2 elements of the sum", c[-2:]
print "PythonSum elapsed time in microseconds", delta.seconds, delta.microseconds
start = datetime.now()
c = numpysum(size)
delta = datetime.now() - start
print "The last 2 elements of the sum", c[-2:]
print "NumPySum elapsed time in microseconds", delta.seconds, delta.microseconds

The code is available on pastebin: http://paste.ubuntu.com/1169976/
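As an alternative to converting to floats, the integer width can be requested explicitly. The following is only a sketch, written for Python 3 rather than the Python 2 used above; the function name `numpysum_int64` is made up for illustration. It assumes the sums fit in a signed 64-bit integer, which holds for n = 1000000. Note that float64 values are only exact up to 2**53, so the float version gives rounded results at this size, while int64 stays exact.

```python
import numpy

def numpysum_int64(n):
    # Ask for 64-bit integers explicitly instead of relying on the
    # platform's default integer width.
    a = numpy.arange(n, dtype=numpy.int64) ** 2
    b = numpy.arange(n, dtype=numpy.int64) ** 3
    return a + b

c = numpysum_int64(1000000)
print("The last 2 elements of the sum", c[-2:])
```

This still overflows once i**2 + i**3 exceeds the int64 range, so it moves the limit rather than removing it.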

Answer 2

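I think there's some confusion in this thread. The reason the pure-Python (i.e. non-numpy) code works has nothing to do with 32-bit versus 64-bit. It works because Python ints can be arbitrarily large. [There are some implementation details in the background about whether the type is called int or long, but you don't need to worry about that; the conversion is seamless. That's why you'll sometimes see an L at the end of a number.]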

For example:

>>> 2**100
1267650600228229401496703205376L

NumPy integer dtypes, on the other hand, are fixed-precision and will always fail for sufficiently large numbers, whatever the width:

>>> for kind in numpy.int8, numpy.int16, numpy.int32, numpy.int64:
...     for power in 1, 2, 5, 20:
...         print kind, power, kind(10), kind(10)**power
... 
<type 'numpy.int8'> 1 10 10
<type 'numpy.int8'> 2 10 100
<type 'numpy.int8'> 5 10 100000
<type 'numpy.int8'> 20 10 -2147483648
<type 'numpy.int16'> 1 10 10
<type 'numpy.int16'> 2 10 100
<type 'numpy.int16'> 5 10 100000
<type 'numpy.int16'> 20 10 -2147483648
<type 'numpy.int32'> 1 10 10
<type 'numpy.int32'> 2 10 100
<type 'numpy.int32'> 5 10 100000
<type 'numpy.int32'> 20 10 1661992960
<type 'numpy.int64'> 1 10 10
<type 'numpy.int64'> 2 10 100
<type 'numpy.int64'> 5 10 100000
<type 'numpy.int64'> 20 10 7766279631452241920
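The fixed width also accounts for the threshold of 1291 in the question, assuming the default integer there was 32 bits wide. A small Python 3 sketch (the helper name `first_overflow_index` is made up for illustration) finds the first index whose sum no longer fits in an int32 and then reproduces the wraparound:

```python
import numpy

INT32_MAX = numpy.iinfo(numpy.int32).max  # 2147483647

def first_overflow_index():
    # Smallest i for which i**2 + i**3 exceeds the int32 range,
    # computed with Python's arbitrary-precision ints.
    i = 0
    while i ** 2 + i ** 3 <= INT32_MAX:
        i += 1
    return i

i = first_overflow_index()
print(i, i ** 2 + i ** 3)  # 1290 2148353100

# With size = i + 1 = 1291, the last element is the first one to wrap around.
a = numpy.arange(i + 1, dtype=numpy.int32)
c = a ** 2 + a ** 3
print(c[-2:])  # [ 2143362090 -2146614196]
```

The index 1290 is the last element when size is 1291, which matches the point where the asker first saw negative values.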

You can get the same results from numpy as from pure Python by telling it to use Python types, i.e. dtype=object, although performance takes a significant hit:

>>> import numpy
>>> numpy.array([10])
array([10])
>>> numpy.array([10])**100
__main__:1: RuntimeWarning: invalid value encountered in power
array([-2147483648])
>>> numpy.array([10], dtype=object)
array([10], dtype=object)
>>> numpy.array([10], dtype=object)**100
array([ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], dtype=object)
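Applied to the question's function, the dtype=object approach might look like the sketch below. It is written for Python 3, and `numpysum_object` is a hypothetical name; the pure-Python `pythonsum` is rewritten as a list comprehension to serve as a reference.

```python
import numpy

def pythonsum(n):
    # Pure-Python reference using arbitrary-precision ints.
    return [i ** 2 + i ** 3 for i in range(n)]

def numpysum_object(n):
    # dtype=object stores Python ints, so the element-wise ** and +
    # use Python arithmetic and cannot overflow.
    a = numpy.array(range(n), dtype=object)
    return a ** 2 + a ** 3

c = numpysum_object(2000)
print("The last 2 elements of the sum", c[-2:])
print(c.tolist() == pythonsum(2000))
```

Because every element is a Python object, the operations run at roughly Python speed, so this gives up most of the advantage of using numpy.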
