Question

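I am trying to learn nditer for possible use in speeding up my application. Here, I try to make a toy reshape program that takes a size-20 array and reshapes it into a 5x4 array: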

myArray = np.arange(20)
def fi_by_fo_100(array):
    offset = np.array([0, 4, 8, 12, 16])
    it = np.nditer([offset, None],
                      flags=['reduce_ok'],
                      op_flags=[['readonly'],
                                ['readwrite','allocate']],
                      op_axes=[None, [0,1,-1]],
                      itershape=(-1, 4, offset.size))

    while not it.finished:
        indices = np.arange(it[0],(it[0]+4), dtype=int)
        info = array.take(indices)
        '''Just for fun, we'll perform an operation on data.\
           Let's shift it to 100'''
        info = info + 81
        it.operands[1][...]=info
        it.iternext()
    return it.operands[1]

test = fi_by_fo_100(myArray)
>>> test
array([[ 97,  98,  99, 100]])

Clearly, the program is overwriting each result into a single row. So I tried using the indexing functionality of nditer, but still no dice.

flags=['reduce_ok','c_iter'] -> it.operands[1][it.index][...]=info =
IndexError: index out of bounds

flags=['reduce_ok','c_iter'] -> it.operands[1][it.iterindex][...]=info =
IndexError: index out of bounds

flags=['reduce_ok','multi_iter'] -> it.operands[1][it.multi_index][...]=info =
IndexError: index out of bounds

it[0][it.multi_index[1]][...]=info =
IndexError: 0-d arrays can't be indexed

...and so on. What am I missing? Thanks in advance.

Bonus Question

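I just happened across this nice article on nditer. I may be new to Numpy, but this is the first time I've seen Numpy fall this far behind in a speed benchmark. My understanding is that people choose Numpy for its numerical speed and prowess, but iteration is a part of that, no? What is the point of nditer if it's so slow?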

Answer 1

It really helps to break things down by printing out what's going on along the way.

First, let's replace your whole loop with this:

i = 0
while not it.finished:
    i += 1
    it.iternext()  # advance the iterator; without this the loop never ends
print(i)

It'll print 20, not 5. That's because you're doing a 5x4 iteration, not a 5x1 one.
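The same count can be read off the iterator without looping. The following is a minimal sketch, assuming the iterator is built exactly as in the question; it inspects the `itersize` attribute and the shape of the automatically allocated output, which should match the 20 steps counted above and the single-row result shown in the question.

```python
import numpy as np

offset = np.array([0, 4, 8, 12, 16])

# Same construction as in the question.
it = np.nditer([offset, None],
               flags=['reduce_ok'],
               op_flags=[['readonly'],
                         ['readwrite', 'allocate']],
               op_axes=[None, [0, 1, -1]],
               itershape=(-1, 4, offset.size))

n_iterations = it.itersize        # total number of steps the iterator will take
out_shape = it.operands[1].shape  # shape of the allocated output operand

print(n_iterations)
print(out_shape)
```

Checking these two values before writing the loop body is a cheap way to confirm that the iterator has the shape you intended.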

So, why is this even close to working? Well, let's look at the loop more closely:

while not it.finished:
    print('> %s %s' % (it.operands[0], it[0]))
    indices = np.arange(it[0],(it[0]+4), dtype=int)
    info = array.take(indices)
    info = info + 81
    it.operands[1][...]=info
    print('< %s %s' % (it.operands[1], it[1]))
    it.iternext()  # advance the iterator; without this the loop never ends

You'll see that the first five loops step through [0 4 8 12 16], generating [[81 82 83 84]], then [[85 86 87 88]], and so on. The next five loops then do the same thing, and again, and again.

This is also why your c_index solution didn't work: it.index ranges from 0 to 19, and you don't have 20 of anything in it.operands[1].
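To make the mismatch visible, here is a small sketch that rebuilds the question's iterator with the `c_index` flag added (an assumption about what the failing attempt looked like) and records every value of `it.index`. The values should cover 0 through 19, while the output has far fewer rows to index into.

```python
import numpy as np

offset = np.array([0, 4, 8, 12, 16])

# Question's iterator, with 'c_index' added so that it.index is tracked.
it = np.nditer([offset, None],
               flags=['reduce_ok', 'c_index'],
               op_flags=[['readonly'],
                         ['readwrite', 'allocate']],
               op_axes=[None, [0, 1, -1]],
               itershape=(-1, 4, offset.size))

seen = []
while not it.finished:
    seen.append(it.index)  # flat C-order index of the current step
    it.iternext()

n_rows = it.operands[1].shape[0]
print(max(seen), n_rows)
```

Any index at or beyond `n_rows` is what triggers the IndexError reported in the question.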

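If you did the multi_index right and ignored the columns, you could make this work... but you'd still be doing a 5x4 iteration, just repeating each step 4 times, instead of the 5x1 iteration you want.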

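Your it.operands[1][...]=info overwrites the entire output with the current 4-element row each time through the loop. Generally, you shouldn't ever have to do anything to it.operands[1]. The whole point of nditer is that you just take care of each it[1], and the final it.operands[1] is the result.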

Of course, a 5x4 iteration over rows makes no sense. Either do a 5x4 iteration over individual values, or a 5x1 iteration over rows.

If you want the former, the easiest way is to reshape the input array and then just iterate over that:

it = np.nditer([array.reshape(5, -1), None],
               op_flags=[['readonly'],
                         ['readwrite','allocate']])
for a, b in it:
    b[...] = a + 81
return it.operands[1]

But of course that's silly. It's just a slower and more complicated way of writing:

return array+81

And it's a bit silly to suggest that "the way to write your own reshape is to first call reshape, and then..."
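For comparison, this is what the whole task from the question looks like with no iterator at all. It is only a sketch using the question's input (`np.arange(20)`) and its shift of 81; the variable names are illustrative.

```python
import numpy as np

my_array = np.arange(20)

# Reshape into 5 rows of 4, then shift every element in one vectorized step.
result = my_array.reshape(5, 4) + 81

print(result)
```

The last row of `result` is the `[97, 98, 99, 100]` row that the original loop kept overwriting into its single-row output.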

So, you want to iterate over rows, right?

Let's simplify things a bit by getting rid of the allocate and explicitly creating a 5x4 array to start with:

outarray = np.zeros((5,4), dtype=array.dtype)
offset = np.array([0, 4, 8, 12, 16])
it = np.nditer([offset, outarray],
               flags=['reduce_ok', 'c_index'],  # 'c_index' is needed for it.index below
               op_flags=[['readonly'],
                         ['readwrite']],
               op_axes=[None, [0]],
               itershape=[5])

while not it.finished:
    indices = np.arange(it[0],(it[0]+4), dtype=int)
    info = array.take(indices)
    '''Just for fun, we'll perform an operation on data.\
       Let's shift it to 100'''
    info = info + 81
    it.operands[1][it.index][...]=info
    it.iternext()
return it.operands[1]

This is a bit of an abuse of nditer, but at least it does the right thing.

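Since you're just doing a 1D iteration over the source and basically ignoring the second operand, there's really no good reason to use nditer here. If you need to do lockstep iteration over multiple arrays, for a, b in nditer([x, y], …) is cleaner than iterating over x and using the index to access y, just like for a, b in zip(x, y) outside of numpy. And if you need to iterate over multi-dimensional arrays, nditer is usually cleaner than the alternatives. But here, all you're really doing is iterating over [0, 4, 8, 12, 16], doing something with the result, and copying it into another array.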
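The two cases in that paragraph can be sketched side by side. The first part is the row-by-row copy written as a plain loop over the offsets, with no nditer; `shift_rows` and its parameters are hypothetical names introduced only for this illustration. The second part shows the lockstep case where nditer does read cleanly, using two small made-up arrays.

```python
import numpy as np

def shift_rows(array, row_len=4, shift=81):
    """Hypothetical helper: copy `array` into rows of `row_len`, adding `shift`."""
    offsets = np.arange(0, array.size, row_len)  # [0, 4, 8, 12, 16] for size 20
    out = np.zeros((offsets.size, row_len), dtype=array.dtype)
    for i, start in enumerate(offsets):
        # One step per row: slice the source, shift it, store it in row i.
        out[i] = array[start:start + row_len] + shift
    return out

result = shift_rows(np.arange(20))
print(result)

# Lockstep iteration over two same-shaped arrays, where nditer is a good fit.
x = np.arange(6).reshape(2, 3)
y = x * 10
pairs = [(int(a), int(b)) for a, b in np.nditer([x, y])]
print(pairs)
```

The plain loop takes five steps, one per row, which is the 5x1 iteration the question was after.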

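Also, as I mentioned in the comments, if you find yourself using iteration in numpy, you're usually doing something wrong. All of numpy's speed benefits come from letting it execute the tight loops in native C/Fortran or lower-level vector operations. Once you're looping over arrays yourself, you're effectively just doing slow Python numerics with a slightly nicer syntax: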

import numpy as np
import timeit

def add10_numpy(array):
    return array + 10

def add10_nditer(array):
    it = np.nditer([array, None], [],
                   [['readonly'], ['writeonly', 'allocate']])
    for a, b in it:
        np.add(a, 10, b)
    return it.operands[1]

def add10_py(array):
    x, y = array.shape
    outarray = array.copy()
    for i in range(x):  # range works on both Python 2 and 3
        for j in range(y):
            outarray[i, j] = array[i, j] + 10
    return outarray

myArray = np.arange(100000).reshape(250,-1)

for f in add10_numpy, add10_nditer, add10_py:
    print('%12s: %s' % (f.__name__, timeit.timeit(lambda: f(myArray), number=1)))

On my system, this prints:

 add10_numpy: 0.000458002090454
add10_nditer: 0.292730093002
    add10_py: 0.127345085144

This shows the cost of using nditer unnecessarily.
