Suppose I have a slice like x[p:-q:n] or x[::n], and I want to use it to generate the indices to pass to numpy.ufunc.reduceat(x, [p, p + n, p + 2 * n, ...]) or numpy.ufunc.reduceat(x, [0, n, 2 * n, ...]). What is the simplest and most efficient way to do this?
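One direct way (a sketch of my own, not taken from the answer below) is to let the built-in slice.indices method resolve the p / -q / n bounds against the array length, then feed the result to np.arange:

```python
import numpy as np

x = np.arange(100)

# A slice like x[5:-5:10]; slice.indices resolves negative and None bounds
s = slice(5, -5, 10)
start, stop, step = s.indices(len(x))

# Offsets [p, p + n, p + 2n, ...] for reduceat
offsets = np.arange(start, stop, step)

# Sum each run of elements between consecutive offsets.
# Note: reduceat's final segment runs to the end of the array,
# not to the slice's stop value.
result = np.add.reduceat(x, offsets)
```

This handles negative and omitted bounds uniformly, since slice.indices does the normalization.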
Answer 0 (score: 3):
Building on the comments:
In [351]: x=np.arange(100)
In [352]: np.r_[0:100:10]
Out[352]: array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])
In [353]: np.add.reduceat(x,np.r_[0:100:10])
Out[353]: array([ 45, 145, 245, 345, 445, 545, 645, 745, 845, 945], dtype=int32)
In [354]: np.add.reduceat(x,np.arange(0,100,10))
Out[354]: array([ 45, 145, 245, 345, 445, 545, 645, 745, 845, 945], dtype=int32)
In [355]: np.add.reduceat(x,list(range(0,100,10)))
Out[355]: array([ 45, 145, 245, 345, 445, 545, 645, 745, 845, 945], dtype=int32)
In [356]: x.reshape(-1,10).sum(axis=1)
Out[356]: array([ 45, 145, 245, 345, 445, 545, 645, 745, 845, 945])
And the timings:
In [357]: timeit np.add.reduceat(x,np.r_[0:100:10])
The slowest run took 9.30 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 31.2 µs per loop
In [358]: timeit np.add.reduceat(x,np.arange(0,100,10))
The slowest run took 85.75 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.69 µs per loop
In [359]: timeit np.add.reduceat(x,list(range(0,100,10)))
The slowest run took 4.31 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.9 µs per loop
In [360]: timeit x.reshape(-1,10).sum(axis=1)
The slowest run took 5.57 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.5 µs per loop
arange with reduceat looks like the best performer, but it should be tested on more realistic data; at this size the speeds are not that different. The value of r_ is that it lets you use the convenient slice notation; it is defined in a file named index_tricks.py.
With a 10000-element x, the times are 80, 46, 238, and 51 µs respectively.
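One caveat worth adding (my note, not from the answer above): the x.reshape(-1, 10).sum(axis=1) trick only works when the step divides the array length evenly and the slice starts at 0, while reduceat with arange has no such restriction:

```python
import numpy as np

x = np.arange(103)  # length not divisible by 10

# x.reshape(-1, 10) would raise ValueError here, but reduceat still works:
sums = np.add.reduceat(x, np.arange(0, len(x), 10))
# the final group simply covers the 3 leftover elements (100 + 101 + 102)
```

So for arbitrary slices, reduceat with arange offsets is the more general choice.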