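I have the small NumPy matrix below, whose values can only be 0 or 1. The actual matrix I work with is much larger, but this one will do for demonstration purposes. Its shape is (8, 11).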
np_array = np.matrix(
[[0,0,0,0,1,0,0,0,0,0,0],
[0,0,0,1,0,1,0,0,0,0,0],
[0,0,0,1,0,1,0,0,0,0,0],
[0,0,1,0,0,1,1,0,0,0,0],
[0,0,1,0,0,0,1,0,0,0,0],
[0,1,0,0,0,0,1,1,0,1,1],
[0,1,0,0,0,0,0,1,0,1,0],
[1,0,0,0,0,0,0,1,1,1,0]]
)
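I need to change it so that each column has at most one row with the value 1. If several rows in the same column hold a 1, the topmost one (the lowest row index) keeps its 1 and the rest are replaced with 0. This is the result I am after: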
np_array1 = np.matrix(
[[0,0,0,0,1,0,0,0,0,0,0],
[0,0,0,1,0,1,0,0,0,0,0],
[0,0,0,0,0,0,0,0,0,0,0],
[0,0,1,0,0,0,1,0,0,0,0],
[0,0,0,0,0,0,0,0,0,0,0],
[0,1,0,0,0,0,0,1,0,1,1],
[0,0,0,0,0,0,0,0,0,0,0],
[1,0,0,0,0,0,0,0,1,0,0]]
)
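Basically, each column may contain at most a single 1; if several rows hold a 1, only the topmost one is kept. I should mention that a column may also contain no 1 at all, and such columns must stay unchanged. The shape of the matrix must be exactly the same after the transformation as before.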
Answer 0: (score: 3)
Here is one approach -
def per_col(a):
    idx = a.argmax(0)            # row of the first maximum in each column
    out = np.zeros_like(a)
    r = np.arange(a.shape[1])    # column indices
    out[idx, r] = a[idx, r]      # copies a 0 for all-zero columns
    return out
Sample run
Case #1:
In [41]: a
Out[41]:
array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0]])
In [42]: per_col(a)
Out[42]:
array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]])
Case #2 (with an all-zero column inserted):
In [78]: a[:,1] = 0
In [79]: a
Out[79]:
array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0]])
In [80]: per_col(a)
Out[80]:
array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]])
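The reason per_col handles both cases is that argmax returns the first occurrence of the maximum along the axis. For a column containing a 1 that is the topmost 1; for an all-zero column it is row 0, whose value is 0, so copying it changes nothing. A minimal sketch on a small input (the 3x3 array is made up for illustration and is not from the post):

```python
import numpy as np

# Illustrative input: column 1 is all zeros, columns 0 and 2 have two 1s each.
a = np.array([[0, 0, 1],
              [1, 0, 1],
              [1, 0, 0]])

idx = a.argmax(axis=0)          # first row holding each column's maximum
cols = np.arange(a.shape[1])    # column indices 0..n_cols-1

out = np.zeros_like(a)
out[idx, cols] = a[idx, cols]   # copies a 1 where one exists, a 0 otherwise

print(idx)   # [1 0 0]
print(out)
```

Column 1 gets idx 0 only because every entry ties at 0; the value copied there is still 0, so the column stays empty.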
If you are crazy about one-liners or a fan of broadcasting, here is another -
((a.argmax(0) == np.arange(a.shape[0])[:,None]).astype(int))*a.any(0)
Sample run -
In [89]: a
Out[89]:
array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0]])
In [90]: ((a.argmax(0) == np.arange(a.shape[0])[:,None]).astype(int))*a.any(0)
Out[90]:
array([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]])
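The one-liner is easier to follow when split into named steps. The sketch below does that on a small made-up array; the variable names (first, rows, is_first, has_one) are introduced here for illustration only.

```python
import numpy as np

# Illustrative input, not taken from the post.
a = np.array([[0, 0, 1],
              [1, 0, 1],
              [1, 0, 0]])

first = a.argmax(axis=0)                 # shape (n_cols,): row of the first max
rows = np.arange(a.shape[0])[:, None]    # shape (n_rows, 1): column vector of row ids
is_first = first == rows                 # broadcasts to (n_rows, n_cols)
has_one = a.any(axis=0)                  # False for columns with no 1 at all
result = is_first.astype(int) * has_one  # mask out the all-zero columns

print(result)
```

The multiplication by a.any(0) is what keeps all-zero columns empty: without it, row 0 of such a column would be marked, because argmax reports 0 when every entry ties.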
Runtime test -
In [98]: a = np.random.randint(0,2,(100,10000))
# @DSM's soln
In [99]: %timeit ((a == 1) & (a.cumsum(axis=0) == 1)).astype(int)
100 loops, best of 3: 5.19 ms per loop
# Proposed in this post : soln1
In [100]: %timeit per_col(a)
100 loops, best of 3: 3.4 ms per loop
# Proposed in this post : soln2
In [101]: %timeit ((a.argmax(0) == np.arange(a.shape[0])[:,None]).astype(int))*a.any(0)
100 loops, best of 3: 7.73 ms per loop
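Timings like these depend on the machine and the NumPy version, so they are best re-measured locally. Before comparing speeds it is worth confirming that the three expressions agree on random input. A small check, assuming a fixed seed and a smaller array than the one timed above (both are arbitrary choices made here):

```python
import numpy as np

def per_col(a):
    idx = a.argmax(0)
    out = np.zeros_like(a)
    r = np.arange(a.shape[1])
    out[idx, r] = a[idx, r]
    return out

rng = np.random.default_rng(0)             # seed chosen arbitrarily
a = rng.integers(0, 2, size=(100, 1000))   # random 0/1 array
a[:, ::7] = 0                              # force some all-zero columns

res_cumsum = ((a == 1) & (a.cumsum(axis=0) == 1)).astype(int)
res_per_col = per_col(a)
res_bcast = (a.argmax(0) == np.arange(a.shape[0])[:, None]).astype(int) * a.any(0)

all_agree = (np.array_equal(res_cumsum, res_per_col)
             and np.array_equal(res_per_col, res_bcast))
print(all_agree)
```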
Answer 1: (score: 3)
You can use cumsum to count the number of 1s seen so far in each column, and then select the first one:
In [42]: arr.cumsum(axis=0)
Out[42]:
matrix([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 2, 1, 2, 0, 0, 0, 0, 0],
[0, 0, 1, 2, 1, 3, 1, 0, 0, 0, 0],
[0, 0, 2, 2, 1, 3, 2, 0, 0, 0, 0],
[0, 1, 2, 2, 1, 3, 3, 1, 0, 1, 1],
[0, 2, 2, 2, 1, 3, 3, 2, 0, 2, 1],
[1, 2, 2, 2, 1, 3, 3, 3, 1, 3, 1]])
Hence
In [43]: ((arr == 1) & (arr.cumsum(axis=0) == 1)).astype(int)
Out[43]:
matrix([[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]])
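The same idea can be wrapped in a function. This is only a sketch: the name keep_first_one is made up here, and the input is converted with np.asarray so the result is a plain ndarray even when an np.matrix is passed in.

```python
import numpy as np

def keep_first_one(arr):
    """Keep only the topmost 1 in each column of a 0/1 array."""
    arr = np.asarray(arr)
    seen = arr.cumsum(axis=0)        # running count of 1s down each column
    # A cell survives if it holds a 1 and it is the first 1 counted so far.
    return ((arr == 1) & (seen == 1)).astype(int)

# Illustrative input, not taken from the post.
demo = np.array([[0, 0, 1],
                 [1, 0, 1],
                 [1, 0, 0]])
print(keep_first_one(demo))
```

The arr == 1 test matters: seen == 1 alone would also be true for the zeros that sit below a column's only 1.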
Answer 2: (score: 1)
Another approach is:
for i in range(a.shape[1]):
    # zero every 1 below the first one in column i (modifies a in place)
    a[np.where(a[:,i]==1)[0][1:],i] = 0
Output:
[[0 0 0 0 1 0 0 0 0 0 0]
[0 0 0 1 0 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0]
[0 0 1 0 0 0 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 1 0 1 1]
[0 0 0 0 0 0 0 0 0 0 0]
[1 0 0 0 0 0 0 0 1 0 0]]
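Note that the loop overwrites a. If the original data is still needed, one option is to work on a copy. A sketch, with the hypothetical name keep_first_one_loop:

```python
import numpy as np

def keep_first_one_loop(a):
    """Column-by-column version that leaves the input untouched."""
    out = np.array(a, copy=True)
    for i in range(out.shape[1]):
        ones = np.flatnonzero(out[:, i] == 1)   # row indices holding a 1
        out[ones[1:], i] = 0                    # clear all but the first; no-op if 0 or 1 found
    return out

# Illustrative input, not taken from the post.
demo = np.array([[0, 0, 1],
                 [1, 0, 1],
                 [1, 0, 0]])
result = keep_first_one_loop(demo)
print(result)
```

Being a Python-level loop over columns, this is likely to be slower than the vectorized answers on wide arrays, though it is easy to read.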
Answer 3: (score: 1)
You can use the nonzero and unique functions:
c, r = np.nonzero(np_array.T)
_, ind = np.unique(c, return_index=True)
np_array[:] = 0
np_array[r[ind], c[ind]] = 1
For the given example, the result is:
[[0 0 0 0 1 0 0 0 0 0 0]
[0 0 0 1 0 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0]
[0 0 1 0 0 0 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 0 0 0 0 1 0 1 1]
[0 0 0 0 0 0 0 0 0 0 0]
[1 0 0 0 0 0 0 0 1 0 0]]
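This works because np.nonzero on the transpose lists the nonzero cells column by column, with rows in ascending order inside each column, and np.unique with return_index=True reports the first position of each distinct column index. A non-mutating sketch on a plain ndarray (the function name keep_first_one_unique is made up here):

```python
import numpy as np

def keep_first_one_unique(a):
    """Keep only the topmost 1 per column using nonzero + unique."""
    a = np.asarray(a)
    cols, rows = np.nonzero(a.T)        # ordered by column, then by row
    _, first = np.unique(cols, return_index=True)   # first hit per column
    out = np.zeros_like(a)
    out[rows[first], cols[first]] = 1   # all-zero columns never appear in cols
    return out

# Illustrative input, not taken from the post.
demo = np.array([[0, 0, 1],
                 [1, 0, 1],
                 [1, 0, 0]])
print(keep_first_one_unique(demo))
```

Unlike the snippet above, this version writes into a fresh array, so the input is left as it was.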