I have a numpy matrix, an NxM ndarray, that looks like the following:
[
[ 0, 5, 11, 22, 0, 0, 11, 22],
[ 1, 4, 11, 20, 0, 4, 11, 20],
[ 1, 6, 11, 22, 0, 1, 11, 22],
[ 4, 7, 12, 21, 0, 4, 12, 21],
[ 5, 7, 12, 22, 0, 7, 12, 22],
[ 5, 7, 12, 22, 0, 5, 12, 22]
]
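I want to rearrange it row by row so that the zeros in each row come first, without changing the order of the other elements in the row.
My desired output is the following: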
[
[ 0, 0, 0, 5, 11, 22, 11, 22],
[ 0, 1, 4, 11, 20, 4, 11, 20],
[ 0, 1, 6, 11, 22, 1, 11, 22],
[ 0, 4, 7, 12, 21, 4, 12, 21],
[ 0, 5, 7, 12, 22, 7, 12, 22],
[ 0, 5, 7, 12, 22, 5, 12, 22]
]
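For efficiency I need to do this with numpy (so switching to regular nested Python lists and computing on those is not an option). The faster the code, the better.
How can I do this?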
Best, Andrei
Answer 0 (score: 2)
Is looping over the rows allowed?
>>> a
array([[ 0, 5, 11, 22, 0, 0, 11, 22],
[ 1, 4, 11, 20, 0, 4, 11, 20],
[ 1, 6, 11, 22, 0, 1, 11, 22],
[ 4, 7, 12, 21, 0, 4, 12, 21],
[ 5, 7, 12, 22, 0, 7, 12, 22],
[ 5, 7, 12, 22, 0, 5, 12, 22]])
>>> for row in a:
...     row[:] = np.r_[row[row == 0], row[row != 0]]
...
>>> a
array([[ 0, 0, 0, 5, 11, 22, 11, 22],
[ 0, 1, 4, 11, 20, 4, 11, 20],
[ 0, 1, 6, 11, 22, 1, 11, 22],
[ 0, 4, 7, 12, 21, 4, 12, 21],
[ 0, 5, 7, 12, 22, 7, 12, 22],
[ 0, 5, 7, 12, 22, 5, 12, 22]])
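Answer 1 (score: 2)
This approach builds a binary array that marks where the array is zero and where it is nonzero, takes the sorting indices of that binary array, and then applies them to the original array. The sort has to be stable so that elements within each group keep their relative order; numpy's default sort kind is not guaranteed to be stable, so kind='stable' is passed explicitly.
You need an index array as large as the array being sorted, but since everything is done with numpy operations, it is probably faster than looping.
ind = (a != 0).astype(int)   # 0 where a is zero, 1 elsewhere (also handles negative values)
ind = ind.argsort(axis=1, kind='stable')
a = a[np.arange(ind.shape[0])[:,None], ind]
Output: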
>>> a
array([[ 0, 0, 0, 5, 11, 22, 11, 22],
[ 0, 1, 4, 11, 20, 4, 11, 20],
[ 0, 1, 6, 11, 22, 1, 11, 22],
[ 0, 4, 7, 12, 21, 4, 12, 21],
[ 0, 5, 7, 12, 22, 7, 12, 22],
[ 0, 5, 7, 12, 22, 5, 12, 22]])
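For reference, the same idea can be wrapped in a small function. This is only a sketch, not the answerer's exact code: the name zeros_first is made up for illustration, it returns a new array instead of modifying the input, and it uses np.take_along_axis in place of the explicit row index array.

```python
import numpy as np

def zeros_first(a):
    """Return a copy of 2-D array `a` with the zeros of each row moved to the front.

    Hypothetical helper (the name is not from the thread). Non-zero
    elements keep their original relative order.
    """
    # False (zero) sorts before True (non-zero); a stable sort keeps
    # the original order inside each of the two groups.
    order = np.argsort(a != 0, axis=1, kind="stable")
    # Reorder every row according to its own index vector.
    return np.take_along_axis(a, order, axis=1)

a = np.array([[0, 5, 11, 22, 0, 0, 11, 22],
              [1, 4, 11, 20, 0, 4, 11, 20],
              [1, 6, 11, 22, 0, 1, 11, 22]])
print(zeros_first(a))
```

Comparing with a != 0 rather than a > 0 means negative entries are treated as non-zero too.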
Answer 2 (score: 1)
Probably not the most efficient, since it loops over the rows, but it may be a good starting point:
import numpy as np
a = np.array([[ 0, 5, 11, 22, 0, 0, 11, 22],
[ 1, 4, 11, 20, 0, 4, 11, 20],
[ 1, 6, 11, 22, 0, 1, 11, 22],
[ 4, 7, 12, 21, 0, 4, 12, 21],
[ 5, 7, 12, 22, 0, 7, 12, 22],
[ 5, 7, 12, 22, 0, 5, 12, 22]])
size = a.shape[1]
for i in range(a.shape[0]):
    nz = a[i][np.nonzero(a[i])]                      # nonzero values, in their original order
    z = np.zeros(size - nz.shape[0], dtype=a.dtype)  # leading zeros needed to fill the row
    a[i] = np.concatenate((z, nz))
For each row in a, you find the nonzero elements and prepend enough zeros to match the row size.
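Answer 3 (score: 0)
It is possible to get rid of all Python loops by building a boolean mask with the help of np.tile and np.repeat, although you would need to time it on some larger examples to see whether it is worth the extra complexity: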
rows, cols = a.shape
mask = a != 0
nonzeros_per_row = mask.sum(axis=1)
repeats = np.column_stack((cols-nonzeros_per_row, nonzeros_per_row)).ravel()
new_mask = np.repeat(np.tile([False, True], rows), repeats).reshape(rows, cols)
out = np.zeros_like(a)
out[new_mask] = a[mask]
>>> a
array([[ 0, 5, 11, 22, 0, 0, 11, 22],
[ 1, 4, 11, 20, 0, 4, 11, 20],
[ 1, 6, 11, 22, 0, 1, 11, 22],
[ 4, 7, 12, 21, 0, 4, 12, 21],
[ 5, 7, 12, 22, 0, 7, 12, 22],
[ 5, 7, 12, 22, 0, 5, 12, 22]])
>>> out
array([[ 0, 0, 0, 5, 11, 22, 11, 22],
[ 0, 1, 4, 11, 20, 4, 11, 20],
[ 0, 1, 6, 11, 22, 1, 11, 22],
[ 0, 4, 7, 12, 21, 4, 12, 21],
[ 0, 5, 7, 12, 22, 7, 12, 22],
[ 0, 5, 7, 12, 22, 5, 12, 22]])
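One way to do that timing is sketched below. The function names, the array size (2000 x 50), the value range and the repeat count are arbitrary choices for illustration, not something from the thread; the two functions simply wrap the row loop from Answer 0 and the mask construction above so they can be compared on the same input.

```python
import timeit
import numpy as np

def zeros_first_loop(a):
    # Row-by-row version: zeros first, then the non-zeros in original order.
    out = a.copy()
    for row in out:
        row[:] = np.r_[row[row == 0], row[row != 0]]
    return out

def zeros_first_mask(a):
    # Loop-free version using a boolean mask built with np.tile / np.repeat.
    rows, cols = a.shape
    mask = a != 0
    nonzeros_per_row = mask.sum(axis=1)
    # Per row: how many leading False entries, then how many True entries.
    repeats = np.column_stack((cols - nonzeros_per_row, nonzeros_per_row)).ravel()
    new_mask = np.repeat(np.tile([False, True], rows), repeats).reshape(rows, cols)
    out = np.zeros_like(a)
    out[new_mask] = a[mask]
    return out

# Random test data with plenty of zeros (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
big = rng.integers(0, 5, size=(2000, 50))

# Both versions must agree before the timings mean anything.
assert np.array_equal(zeros_first_loop(big), zeros_first_mask(big))

for fn in (zeros_first_loop, zeros_first_mask):
    seconds = timeit.timeit(lambda: fn(big), number=3)
    print(f"{fn.__name__}: {seconds / 3:.4f} s per call")
```

The numbers will depend on the array shape and the fraction of zeros, so it is worth running this with data that resembles the real input.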