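I have many arrays like the one below, in which I want to replace each 0 entry with the nearest non-zero entry at a lower index. This is easy to do with a for loop: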
import numpy as np

input_array = np.array([0.01561, 0.01561, 0.02039, 0.02039, 0.02776, 0.02776,
                        0.03997, 0., 0.03997, 0.06243, 0., 0., 0.0624662,
                        0.11105, 0., 0., 0., 0.11105, 0.24986,
                        0., 0., 0., 0., 0., 0.,
                        0.24986])

for i in range(0, len(input_array)):
    if input_array[i] == 0:
        input_array[i] = input_array[i-1]
Can anyone advise whether it is worth the effort to do this without the loop?
Answer 0 (score: 1)
Adapting the numpy solution from:
Most efficient way to forward-fill NaN values in numpy array
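def foo2(arr):
    # index of each non-zero entry; zero entries get index 0
    idx = np.where(arr == 0, 0, np.arange(len(arr)))
    # running maximum: each zero inherits the last non-zero index before it
    idx = np.maximum.accumulate(idx)
    return arr[idx]

def foo1(arr):
    # the loop from the question, applied to a copy
    arr = arr.copy()
    for i in range(len(arr)):
        if arr[i] == 0:
            arr[i] = arr[i-1]
    return arr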
For your test array, arr, the speed improvement is modest:
In [67]: timeit foo1(arr)
100000 loops, best of 3: 18.1 µs per loop
In [68]: timeit foo2(arr)
The slowest run took 1387.12 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.4 µs per loop
But with a larger array, the loop time grows with the size, while the array version barely changes:
In [69]: arr1=np.concatenate((arr,arr,arr,arr,arr,arr,arr))
In [70]: timeit foo1(arr1)
10000 loops, best of 3: 116 µs per loop
In [71]: timeit foo2(arr1)
The slowest run took 4.16 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 14.6 µs per loop
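If you want to repeat this comparison outside IPython, something like the following sketch should work. The names fill_loop and fill_vectorized, the array size, the random test data, and the repeat count are my own choices for illustration, not part of the timings above, and the numbers you get will depend on your machine.

```python
import timeit

import numpy as np


def fill_loop(arr):
    """Loop version: copy each zero's value from the previous entry."""
    arr = arr.copy()
    for i in range(len(arr)):
        if arr[i] == 0:
            arr[i] = arr[i - 1]
    return arr


def fill_vectorized(arr):
    """Index-based version using a running maximum of non-zero positions."""
    idx = np.where(arr == 0, 0, np.arange(len(arr)))
    idx = np.maximum.accumulate(idx)
    return arr[idx]


# Synthetic test data (an assumption): about half the entries set to zero.
rng = np.random.default_rng(0)
values = rng.random(1000)
values[rng.random(1000) < 0.5] = 0.0
values[0] = 1.0  # non-zero first entry, so arr[i - 1] never wraps around

# Total time in seconds for 100 calls of each version.
loop_time = timeit.timeit(lambda: fill_loop(values), number=100)
vec_time = timeit.timeit(lambda: fill_vectorized(values), number=100)
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")
```

Setting the first entry to a non-zero value keeps the two versions comparable, because they treat a zero at index 0 differently.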
Details of the idx construction:
In [72]: idx=np.arange(len(arr))
In [73]: idx[arr==0]=0
In [74]: idx
Out[74]:
array([ 0, 1, 2, 3, 4, 5, 6, 0, 8, 9, 0, 0, 12, 13, 0, 0, 0, 17, 18, 0, 0, 0, 0, 0, 0, 25])
In [75]: idx=np.maximum.accumulate(idx)
In [76]: idx
Out[76]:
array([ 0, 1, 2, 3, 4, 5, 6, 6, 8, 9, 9, 9, 12, 13, 13, 13, 13, 17, 18, 18, 18, 18, 18, 18, 18, 25], dtype=int32)
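The same steps on a smaller array may make this easier to follow. This is only a sketch: forward_fill_zeros is a name I made up for it, and the two small arrays are illustrative. It also shows one difference from the loop: if the array starts with a zero, the loop reads arr[-1] (the last element), whereas the index version leaves that leading zero in place.

```python
import numpy as np


def forward_fill_zeros(arr):
    """Replace each zero with the nearest non-zero value at a lower index."""
    arr = np.asarray(arr)
    # Keep the position of every non-zero entry; zeros are mapped to index 0.
    idx = np.where(arr == 0, 0, np.arange(len(arr)))
    # Running maximum: each zero picks up the last non-zero position seen
    # so far (or stays at 0 if no non-zero entry has appeared yet).
    idx = np.maximum.accumulate(idx)
    return arr[idx]


small = np.array([3.0, 0.0, 0.0, 5.0, 0.0])
filled = forward_fill_zeros(small)
print(filled)  # values: 3, 3, 3, 5, 5

# A leading zero has nothing before it, so it is left as zero.
leading = np.array([0.0, 1.0, 0.0, 2.0])
filled_leading = forward_fill_zeros(leading)
print(filled_leading)  # values: 0, 1, 1, 2
```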