Suppose I have a 100000 x 100 matrix:
import numpy as np
mat = np.random.randint(2, size=(100000,100))
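I want to iterate over this matrix. If a row consists entirely of 1s or entirely of 0s, I want to change a state variable to that value. If the state did not change (the row is mixed), I want to set the entire row to the value of state. The initial value of state is 0.
Naively, this can be done in a for loop as follows: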
state = 0
for row in mat:
    if set(row) == {1}:
        state = 1
    elif set(row) == {0}:
        state = 0
    else:
        row[:] = state
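However, as the matrix grows, this takes an impractical amount of time. Could someone point me in the right direction on how to use numpy to vectorize this loop and speed it up?
So for the sample input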
array([[0, 1, 0],
[0, 0, 1],
[1, 1, 1],
[0, 0, 1],
[0, 0, 1]])
the expected output in this case is
array([[0, 0, 0],
[0, 0, 0],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
Answer 0 (score: 2)
Approach #1: NumPy vectorized
Here's a vectorized one -
def check_all(a, state): # a is input matrix/array
    # Get zeros and ones all masks
    zm = (a==0).all(1)
    om = (a==1).all(1)

    # "Attach" boundaries with False values at the start of these masks.
    # These will be used to detect rising edges (as indices) on these masks.
    zma = np.r_[False,zm]
    oma = np.r_[False,om]

    omi = np.flatnonzero(oma[:-1] < oma[1:])
    zmi = np.flatnonzero(zma[:-1] < zma[1:])

    # Group the indices and the signatures (values as 1s and -1s)
    ai = np.r_[omi,zmi]
    av = np.r_[np.ones(len(omi),dtype=int),-np.ones(len(zmi),dtype=int)]

    # Sort the grouped-indices, thus we would know the positions
    # of these group starts. Then index into the signatures/values
    # and indices with those, giving us the information on how these signatures
    # occur through the length of the input
    sidx = ai.argsort()
    val,aidx = av[sidx],ai[sidx]

    # The identical consecutive signatures are to be removed
    mask = np.r_[True,val[:-1]!=val[1:]]
    v,i = val[mask],aidx[mask]

    # Also, note that we are assigning all 1s as +1 signature and all 0s as -1
    # So, in case the starting signature is a 0, assign a value of 0
    if v[0]==-1:
        v[0] = 0

    # Initialize 1D o/p array, which stores the signatures as +1s and -1s.
    # The bigger level idea is that performing cumsum at the end would give us the
    # desired 1D output
    out1d = np.zeros(len(a),dtype=a.dtype)

    # Assign the values at i positions
    out1d[i] = v

    # Finally cumsum to get desired output
    out1dc = out1d.cumsum()

    # Correct the starting positions based on starting state value
    out1dc[:i[0]] = state

    # Convert to 2D view for mem. and perf. efficiency
    out = np.broadcast_to(out1dc[:,None],a.shape)
    return out
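The core idea above (write a step at each position where the state switches, then cumsum) can be sketched in a shorter form. This is a simplified variant for illustration, not the answer's exact code: the function name `state_by_cumsum` is hypothetical, it assumes the matrix holds only 0s and 1s, and it returns the 1D per-row state rather than the broadcast 2D view.

```python
import numpy as np

def state_by_cumsum(a, state):
    """Per-row state as a 1D array (hypothetical helper, assumes 0/1 entries)."""
    zm = (a == 0).all(1)          # rows that are all zeros
    om = (a == 1).all(1)          # rows that are all ones
    idx = np.flatnonzero(zm | om) # positions of uniform rows
    vals = om[idx].astype(int)    # the state each uniform row switches to
    # State in effect just before each uniform row (initial state first)
    prev = np.concatenate(([state], vals))[:-1]
    # Place the change in state at each uniform row, zero elsewhere
    steps = np.zeros(len(a), dtype=int)
    steps[idx] = vals - prev
    # Running sum of the changes, offset by the initial state
    return steps.cumsum() + state

a = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 1, 1],
              [0, 0, 1],
              [0, 0, 1]])
out1d = state_by_cumsum(a, 0)
out = np.broadcast_to(out1d[:, None], a.shape)
print(out)
```

On the sample input from the question this yields the per-row states 0, 0, 1, 1, 1, which match the expected output once broadcast across the columns.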
Approach #2: Numba-based
Here's another numba-based one for memory and perf. efficiency -
from numba import njit

@njit(parallel=True)
def func1(zm, om, out, start_state, cur_state):
    # This outputs 1D version of required output.

    # Start off with the starting given state
    newval = start_state

    # Loop through zipped zeros-all and ones-all masks and in essence do :
    # Switch between zeros and ones based on whether the other ones
    # are occurring through or not, prior to the current state
    # (`not` is used for the boolean flip; in plain Python ~True is -2)
    for i,(z,o) in enumerate(zip(zm,om)):
        if z and cur_state:
            cur_state = not cur_state
            newval = 0
        if o and not cur_state:
            cur_state = not cur_state
            newval = 1
        out[i] = newval
    return out

def check_all_numba(a, state):
    # Get zeros and ones all masks
    zm = (a==0).all(1)
    om = (a==1).all(1)

    # Decide the starting state
    cur_state = zm.argmax() < om.argmax()

    # Initialize 1D o/p array with given state values
    out1d = np.full(len(a), fill_value=state)
    func1(zm, om, out1d, state, cur_state)

    # Broadcast into the 2D view for memory and perf. efficiency
    return np.broadcast_to(out1d[:,None],a.shape)
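Both approaches return the result through np.broadcast_to, which is worth a quick illustration because it affects how the output can be used afterwards. The sketch below uses made-up values for the 1D column and shows that the 2D result is a read-only view that repeats the column without copying it; a .copy() is needed if the result must be modified in place.

```python
import numpy as np

col = np.array([0, 0, 1, 1, 1])              # illustrative 1D per-row state
view = np.broadcast_to(col[:, None], (5, 3)) # 2D view, no data copied

print(view.flags.writeable)  # the broadcast view is read-only
print(view.strides)          # stride along the columns is 0

writable = view.copy()       # materialize a real (5, 3) array when needed
writable[0, 0] = 1
```

This is why the broadcast version is cheap in memory: only one value per row is stored, whatever the number of columns.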
Answer 1 (score: 1)
You can leverage np.maximum.accumulate to do this without a loop:
R = 5 # 100000
C = 3 # 100
mat = np.random.randint(2, size=(R,C))
print(mat) # original matrix
state = np.zeros((1,C), dtype=int) # or np.ones((1,C), dtype=int)
mat = np.concatenate([state,mat]) # insert state row
zRows = np.isin(np.sum(mat,1),[0,C]) # all zeroes or all ones
iRows = np.arange(R+1) * zRows.astype(int) # base indexes (np.int was removed from NumPy)
mat = mat[np.maximum.accumulate(iRows)][1:] # indirection, remove state
print(mat) # modified
# original
[[0 0 1]
[1 1 1]
[1 0 1]
[0 0 0]
[1 0 1]]
# modified
[[0 0 0]
[1 1 1]
[1 1 1]
[0 0 0]
[0 0 0]]
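It works by preparing an indirection array for the rows that need to be changed. This is done by taking an np.arange of the row indexes and setting to zero the indexes of rows that need to be replaced. Accumulating the maximum index then maps each replaced row to the nearest all-zeroes or all-ones row before it.
For example: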
[ 0, 1, 2, 3, 4, 5 ] # row indexes
[ 0, 1, 0, 0, 1, 0 ] # rows that are all zeroes or all ones (zRows)
[ 0, 1, 0, 0, 4, 0 ] # multiplied (iRows)
[ 0, 1, 1, 1, 4, 4 ] # np.maximum.accumulate
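This gives us the list of indexes from which the row contents should be taken.
The state is represented by an extra row that is inserted at the start of the matrix before performing the operation and removed afterwards.
For very small matrices (5x3) this solution will be slightly slower, but for larger ones it can give a 20x speed-up (100000x100: 0.7 seconds vs 14 seconds).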
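The indirection trick described above can be wrapped in a small function and checked against the sample input from the question. This is a minimal sketch: the function name `fill_from_last_uniform_row` is hypothetical, and it assumes the matrix contains only 0s and 1s so that a row sum of 0 or C identifies a uniform row.

```python
import numpy as np

def fill_from_last_uniform_row(mat, state=0):
    """Replace mixed rows with the last all-0 or all-1 row (hypothetical helper)."""
    R, C = mat.shape
    # Prepend a row holding the initial state so every row has a source
    state_row = np.full((1, C), state, dtype=mat.dtype)
    ext = np.concatenate([state_row, mat])
    # True for rows that are all zeroes or all ones (0/1 entries assumed)
    uniform = np.isin(ext.sum(axis=1), [0, C])
    # Index of each uniform row, 0 for mixed rows, then running maximum
    src = np.maximum.accumulate(np.arange(R + 1) * uniform)
    # Gather the source rows and drop the prepended state row
    return ext[src][1:]

sample = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1],
                   [0, 0, 1],
                   [0, 0, 1]])
result = fill_from_last_uniform_row(sample, state=0)
print(result)
```

Unlike the broadcast-based answers, this returns a regular writable array, at the cost of materializing the full R x C result.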
Answer 2 (score: 0)
Here is a simple and fast numpy method:
import numpy as np

def pp():
    # Note: reads `a` and `state` from the enclosing (global) scope
    m,n = a.shape
    A = a.sum(axis=1)
    A = np.where((A==0)|(A==n))[0]
    if not A.size:
        return np.ones_like(a) if state else np.zeros_like(a)
    st = np.concatenate([np.arange(int(A[0]!=0)), A, [m]])
    v = a[st[:-1],0]
    if A[0]:
        v[0] = state
    return np.broadcast_to(v.repeat(st[1:]-st[:-1])[:,None],(m,n))
I timed it using:
state=0
a = (np.random.random((100000,100))<np.random.random((100000,1))).astype(int)
Timings on this simple test case:
0.8655898020006134 # me
4.089095343002555 # Alain T.
2.2958932030014694 # Divakar 1
2.2178015549980046 # & 2
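A comparison like this is only meaningful if the fast version produces the same result as the loop from the question, so a correctness check is a sensible first step. The sketch below is one possible way to set that up, not the harness used for the numbers above: `loop_reference` and `by_run_lengths` are hypothetical names, the latter is a parameterized rewrite of the run-length idea in this answer, and the matrix is kept small so the loop finishes quickly. Timings will vary by machine, so none are quoted.

```python
import timeit
import numpy as np

def loop_reference(a, state):
    """Row-by-row loop from the question, applied to a copy."""
    out = a.copy()
    for row in out:
        if row.all():
            state = 1
        elif not row.any():
            state = 0
        else:
            row[:] = state
    return out

def by_run_lengths(a, state):
    """Run-length variant taking its inputs as arguments (0/1 entries assumed)."""
    m, n = a.shape
    s = a.sum(axis=1)
    starts = np.flatnonzero((s == 0) | (s == n))  # uniform rows
    if starts.size == 0:
        return np.full_like(a, state)
    # Run boundaries: 0 (if the first row is mixed), uniform rows, end
    st = np.concatenate([np.arange(int(starts[0] != 0)), starts, [m]])
    v = a[st[:-1], 0]                             # value at each run start
    if starts[0]:
        v[0] = state                              # leading mixed run uses the initial state
    return np.broadcast_to(v.repeat(np.diff(st))[:, None], (m, n))

rng = np.random.default_rng(0)
# Per-row probability of a 1, so that uniform rows actually occur
a = (rng.random((2000, 20)) < rng.random((2000, 1))).astype(int)

for state in (0, 1):
    assert np.array_equal(loop_reference(a, state), by_run_lengths(a, state))

t_loop = timeit.timeit(lambda: loop_reference(a, 0), number=3)
t_fast = timeit.timeit(lambda: by_run_lengths(a, 0), number=3)
print(f"loop: {t_loop:.4f}s  run-lengths: {t_fast:.4f}s")
```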