I want to modify a multidimensional numpy array (say, mydata) based on several boolean conditions applied in cascade, one after another.
This works:
mydata[condition] = something
This does not:
mydata[condition1][condition2] = something
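where all the conditions are boolean arrays of compatible (broadcastable) shapes. Is there a reason why it does not work, and what would be a good solution? For now, I work around it by assigning back to the original: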
tempdata = mydata[condition1]
tempdata[condition2] = something
mydata[condition1] = tempdata
Answer 0 (score: 2)
To handle such cases, use chained / cascaded integer indexing -
idx1 = np.flatnonzero(condition1)
idx2 = np.flatnonzero(condition2)
mydata[idx1[idx2]] = something
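As for why the chained form in the question has no effect: as far as I know, indexing with a boolean mask returns a new array rather than a view, so the second assignment lands in a temporary that is thrown away. A minimal sketch to check this, using the same small arrays as the sample run (the variable names selected, shares_memory, before and unchanged are mine, for illustration only):

```python
import numpy as np

mydata = np.array([2, 6, 8, 0, 9, 3, 1, 4])
condition1 = np.array([True, False, True, True, True, False, False, True])
condition2 = np.array([False, True, False, True, True])

# Boolean (mask) indexing used to read values produces a copy, not a view.
selected = mydata[condition1]
shares_memory = np.shares_memory(selected, mydata)

# The chained assignment therefore writes into that temporary copy,
# and mydata itself is left untouched.
before = mydata.copy()
mydata[condition1][condition2] = -1
unchanged = np.array_equal(mydata, before)

print(shares_memory, unchanged)
```

Integer indexing composes cleanly because idx1[idx2] is computed first and mydata is then indexed only once, on the left-hand side of the assignment.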
Sample run -
In [42]: mydata = np.array([2,6,8,0,9,3,1,4])
...: mydata_copy = mydata.copy() # make copy for verification
...: condition1 = np.array([True,False,True,True,True,False,False,True])
...: condition2 = np.array([False,True,False,True,True])
...: something = -1
...:
# Working solution from question
In [43]: tempdata = mydata[condition1]
...: tempdata[condition2] = something
...: mydata[condition1] = tempdata
...:
In [44]: mydata # Check changed values
Out[44]: array([ 2, 6, -1, 0, -1, 3, 1, -1])
# Proposed solution
In [45]: idx1 = np.flatnonzero(condition1)
...: idx2 = np.flatnonzero(condition2)
...: mydata_copy[idx1[idx2]] = something
...:
In [46]: mydata_copy # Verify changed values in copy
Out[46]: array([ 2, 6, -1, 0, -1, 3, 1, -1])
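The sample run above is 1-D. For a multidimensional mydata with a full-shape condition1, np.flatnonzero returns indices into the flattened array, so plain mydata[idx1[idx2]] would index along the first axis instead. A sketch of one way to handle that, assuming condition2 is a 1-D mask over mydata[condition1], is to assign through mydata.flat. This is my reading of the N-D case, not something the answer states:

```python
import numpy as np

mydata = np.arange(12).reshape(3, 4)
condition1 = mydata % 2 == 0           # full-shape mask: selects 0, 2, 4, 6, 8, 10
condition2 = mydata[condition1] > 4    # 1-D mask over the selected values

# Reference result using the workaround from the question.
expected = mydata.copy()
tempdata = expected[condition1]
tempdata[condition2] = -1
expected[condition1] = tempdata

# Chained integer indexing on flat (row-major) positions.
idx1 = np.flatnonzero(condition1)
idx2 = np.flatnonzero(condition2)
mydata.flat[idx1[idx2]] = -1

print(np.array_equal(mydata, expected))
```

Both mask indexing and np.flatnonzero walk the array in row-major order, which is why the two results line up.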
Alternative method: if you don't mind editing condition1, you could instead do -
condition1[idx1] = condition2
Then use mydata[condition1] = something as the final step.
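A small sketch of the alternative method on the sample data. To avoid clobbering condition1, this version works on a copy; the name combined is mine and is only for illustration:

```python
import numpy as np

mydata = np.array([2, 6, 8, 0, 9, 3, 1, 4])
condition1 = np.array([True, False, True, True, True, False, False, True])
condition2 = np.array([False, True, False, True, True])

# Write condition2 into the True positions of (a copy of) condition1,
# which merges the two cascaded masks into one full-length mask.
combined = condition1.copy()
combined[np.flatnonzero(condition1)] = condition2

# A single boolean assignment then does the whole job.
mydata[combined] = -1
print(mydata)
```

Dropping the .copy() and writing into condition1 directly gives the in-place variant described above.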
Benchmarking
Let's time the proposed approach and see whether it offers any benefit over the one in the question.
Approaches -
# Original approach (note: `something` is read from the enclosing/global scope)
def org_app(mydata,condition1,condition2):
    tempdata = mydata[condition1]
    tempdata[condition2] = something
    mydata[condition1] = tempdata
    return mydata
# Proposed one
def proposed_app(mydata,condition1,condition2):
    idx1 = np.flatnonzero(condition1)
    idx2 = np.flatnonzero(condition2)
    mydata[idx1[idx2]] = something
    return mydata
Timings -
In [58]: mydata = np.random.rand(1000000)
...: mydata_copy = mydata.copy()
...: condition1 = np.random.rand(mydata.size)>0.5
...: condition2 = np.random.rand(condition1.sum())>0.5
...: something = -1
...:
In [59]: %timeit org_app(mydata,condition1,condition2)
100 loops, best of 3: 14.1 ms per loop
In [61]: %timeit proposed_app(mydata_copy,condition1,condition2)
100 loops, best of 3: 7.44 ms per loop
Incorporating the alternative method should bring a further performance boost.
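I have not timed that variant, so here is a sketch for checking it yourself. The function alt_app and its something parameter are hypothetical names of mine. It copies condition1 so repeated runs stay valid, and that copy adds some overhead the in-place version would not have:

```python
import timeit
import numpy as np

def proposed_app(mydata, condition1, condition2, something=-1):
    # Chained integer indexing, as in the answer.
    idx1 = np.flatnonzero(condition1)
    idx2 = np.flatnonzero(condition2)
    mydata[idx1[idx2]] = something
    return mydata

def alt_app(mydata, condition1, condition2, something=-1):
    # Alternative method: merge both masks, then one boolean assignment.
    combined = condition1.copy()
    combined[np.flatnonzero(condition1)] = condition2
    mydata[combined] = something
    return mydata

rng = np.random.default_rng(0)
data = rng.random(100_000)
condition1 = rng.random(data.size) > 0.5
condition2 = rng.random(condition1.sum()) > 0.5

# Both variants should produce identical results.
result_proposed = proposed_app(data.copy(), condition1, condition2)
result_alt = alt_app(data.copy(), condition1, condition2)
same = np.array_equal(result_proposed, result_alt)

# Rough timing; absolute numbers depend on the machine and NumPy version.
elapsed_alt = timeit.timeit(lambda: alt_app(data, condition1, condition2), number=10)
print(same, elapsed_alt)
```

In IPython, %timeit alt_app(mydata_copy, condition1, condition2) would give numbers directly comparable to the timings above.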