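I want to apply cumsum to the non-zero elements of a NumPy array: simply skip the zeros in the array and apply cumsum to the rest. Suppose I have an np.array: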
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
My result should be:
[1,3,4,6,11,0,20,26,0,28,31,0]
I tried this:
a = np.cumsum(a[a!=0])
But the result is:
[1,3,4,6,11,20,26,28,31]
Any ideas?
Answer 0 (score: 2)
You need to mask the original array so that only the non-zero elements are overwritten:
In [9]:
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
a[a!=0] = np.cumsum(a[a!=0])
a
Out[9]:
array([ 1, 3, 4, 6, 11, 0, 20, 26, 0, 28, 31, 0])
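To see why the masked assignment works, it may help to split the one-liner into steps. This is only a sketch: the names `mask`, `running` and `result` are mine and not part of the answer, and I work on a copy so the input stays untouched.

```python
import numpy as np

a = np.array([1, 2, 1, 2, 5, 0, 9, 6, 0, 2, 3, 0])

mask = a != 0                  # boolean array, True at the non-zero positions
running = np.cumsum(a[mask])   # cumulative sum over the non-zero values only
result = a.copy()              # copy, so the original array is not modified
result[mask] = running         # write the sums back into the non-zero slots
print(result)
```

The boolean mask selects the same positions on both sides of the assignment, so the shortened cumsum lines up with the non-zero slots and the zeros are never touched.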
Another approach is to use np.where:
In [93]:
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
a = np.where(a!=0,np.cumsum(a),a)
a
Out[93]:
array([ 1, 3, 4, 6, 11, 0, 20, 26, 0, 28, 31, 0])
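The np.where version runs cumsum over the whole array, zeros included. That gives the same numbers because a zero adds nothing to the running total, so np.where only has to put the zeros back. A quick sanity check on random data, assuming a NumPy recent enough to have `np.random.default_rng` (1.17 or later); the seed and sizes are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed, for repeatability
a = rng.integers(0, 5, size=1000)    # values 0..4, so plenty of zeros

# Approach 1: masked assignment, done on a copy
masked = a.copy()
masked[masked != 0] = np.cumsum(masked[masked != 0])

# Approach 2: cumsum over everything, then restore the zeros
where = np.where(a != 0, np.cumsum(a), a)

print(np.array_equal(masked, where))
```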
Timings
In [91]:
%%timeit
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
a[a!=0] = np.cumsum(a[a!=0])
a
The slowest run took 4.93 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 12.6 µs per loop
In [94]:
%%timeit
a = np.array([1,2,1,2,5,0,9,6,0,2,3,0])
a = np.where(a!=0,np.cumsum(a),a)
a
The slowest run took 6.00 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 10.5 µs per loop
The above shows that np.where is faster than the first method on this small array (10.5 µs versus 12.6 µs per loop).
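Answer 1 (score: 1)
In my opinion, the suggestion jotasi made in a comment on the OP is the most idiomatic. Here are some timings, but note that Shawn L.'s answer returns a Python list rather than a NumPy array, so the results are not strictly comparable.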
import numpy as np

def jotasi(a):
    b = np.cumsum(a)
    b[a==0] = 0
    return b

def EdChum(a):
    a[a!=0] = np.cumsum(a[a!=0])
    return a

def ShawnL(a):
    b = np.cumsum(a)
    b = [b[i] if ((i > 0 and b[i] != b[i-1]) or i==0) else 0 for i in range(len(b))]
    return b

def Ed2(a):
    return np.where(a!=0,np.cumsum(a),a)
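For testing, I generated a NumPy array of 1E5 integers in [0,100], so about 1% of them are 0. These results are from NumPy 1.9.2 and Python 2.7.12, and are presented from slowest to fastest: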
import timeit
a = np.random.random_integers(0,100,100000)
len(a[a==0]) #verify there are some 0's
1003
timeit.timeit("ShawnL(a)", "from __main__ import a,EdChum,ShawnL,jotasi,Ed2", number=250)
11.743098020553589
timeit.timeit("EdChum(a)", "from __main__ import a,EdChum,ShawnL,jotasi,Ed2", number=250)
0.1794271469116211
timeit.timeit("Ed2(a)", "from __main__ import a,EdChum,ShawnL,jotasi,Ed2", number=250)
0.1282949447631836
timeit.timeit("jotasi(a)", "from __main__ import a,EdChum,ShawnL,jotasi,Ed2", number=250)
0.09286999702453613
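I'm a little surprised that there is such a big difference between jotasi's and Ed Chum's answers; I guess minimizing the boolean operations is noticeable. No surprise that the list comprehension is slow.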
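A side note on the harness: EdChum writes into its argument, so every call changes the shared array `a` that the later timings also read. The sketch below is one way to rule that out, by handing each call a fresh copy. It assumes Python 3.6+ and NumPy 1.17+ (`np.random.default_rng`). The lowercase function names, the seed and the repeat count are my own choices, and I have not reproduced the numbers above with it.

```python
import timeit
import numpy as np

def jotasi(a):
    b = np.cumsum(a)
    b[a == 0] = 0            # one comparison, one scalar write
    return b

def ed_chum(a):
    a[a != 0] = np.cumsum(a[a != 0])   # two comparisons, modifies a in place
    return a

def ed2(a):
    return np.where(a != 0, np.cumsum(a), a)

rng = np.random.default_rng(0)
a = rng.integers(0, 101, size=100_000)   # integers in [0, 100]

candidates = {"jotasi": jotasi, "ed_chum": ed_chum, "ed2": ed2}
timings = {}
for name, func in candidates.items():
    # a.copy() keeps ed_chum's in-place write away from the shared input;
    # every candidate pays the same copy cost.
    timings[name] = timeit.timeit(lambda: func(a.copy()), number=50)

# Slowest first, matching the ordering used above
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {seconds:.4f} s")
```

Because the copy is included in every measurement, the absolute times are inflated, but the comparison between candidates stays fair.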
Answer 2 (score: 0)
Trying to simplify it :)
b=np.cumsum(a)
[b[i] if ((i > 0 and b[i] != b[i-1]) or i==0) else 0 for i in range(len(b))]
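The list comprehension keeps b[i] wherever the running total changed from the previous element and writes 0 elsewhere. If a NumPy array is wanted instead of a list, the same test can probably be written with slices. This is a sketch of that idea, not part of the answer: the `changed` array is my own name, and it assumes a non-empty input.

```python
import numpy as np

a = np.array([1, 2, 1, 2, 5, 0, 9, 6, 0, 2, 3, 0])
b = np.cumsum(a)

# changed[i] is True when the running total differs from the previous one;
# the first element is always kept, mirroring the i == 0 case above.
changed = np.empty(len(b), dtype=bool)
changed[0] = True
changed[1:] = b[1:] != b[:-1]

result = np.where(changed, b, 0)
print(result)
```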