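Is there a simple way to delete the columns of a NumPy matrix that sum to zero, along with their corresponding rows?
I am trying to build the transition matrix for PageRank, but the code I wrote does not seem to be the most efficient: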
i = 1
while True:
    if len(graph) == i-1:
        break
    else:
        col_sum = np.sum(graph[:,i-1])
        if col_sum == 0:
            graph = np.delete(graph, np.s_[i-1], 1)
            graph = np.delete(graph, i-1, 0)
            nodes.remove(nodes[i-1])
            i = 0
        i += 1
Answer 0: (score: 1)
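Here is a way to do it using np.ix_:
mask = np.nonzero(np.sum(graph, axis=0))[0]  # axis=0 sums down each column; mask holds the indices of non-zero columns
graph = graph[np.ix_(mask, mask)]  # keep only those columns and the matching rows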
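To make the role of np.ix_ concrete, here is a minimal sketch on a small made-up matrix. The 4-node `graph`, the `nodes` labels, and the variable names `keep`, `pruned`, `kept_nodes`, and `diagonal_only` are illustrative assumptions, not part of the original question. It also shows what plain `graph[keep, keep]` returns instead.

```python
import numpy as np

# Hypothetical 4-node matrix; column 2 (node "C") sums to zero.
graph = np.array([
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
])
nodes = ["A", "B", "C", "D"]

# Indices of the columns whose sum is non-zero.
keep = np.flatnonzero(graph.sum(axis=0))

# np.ix_ builds an open mesh, so every kept row is combined with every kept column.
pruned = graph[np.ix_(keep, keep)]

# nodes is a plain list, so pick the surviving labels by index.
kept_nodes = [nodes[i] for i in keep]

# Without np.ix_, the two index arrays are paired element-wise,
# which returns only graph[0, 0], graph[1, 1], graph[3, 3] as a 1-D array.
diagonal_only = graph[keep, keep]

print(pruned)
print(kept_nodes)
print(diagonal_only)
```

The key design point is that the same index array is applied to both axes, so rows and columns stay aligned with the node labels.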
Answer 1: (score: 0)
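NumPy is designed to do this kind of thing without loops. Most operators, such as np.sum, are designed to work on matrices or multidimensional arrays and take an axis argument that tells them which dimension to operate along. Finally, we can use index arrays or boolean mask arrays to select elements from an array.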
import numpy as np
np.random.seed(42)
nodes = [chr(i + 65) for i in range(10)]
a = (np.random.randn(10, 10) > 1.5).astype(int)
print('before:')
print(a)
print(nodes)
col_sum = np.sum(a, axis=0)  # sum of each column
idx = np.flatnonzero(col_sum) # indices of non-zero columns
# remove columns and rows
a = a[:, idx][idx, :] # note that a[idx, idx] won't work
# if nodes was an array we could do this:
#nodes = nodes[idx]
# but nodes is a list, so we need a list comprehension:
nodes = [nodes[i] for i in idx]  # idx holds positions, not a boolean mask
print('\nafter:')
print(a)
print(nodes)
Result:
before:
[[0 0 0 1 0 0 1 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 1 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 1 0 1 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]]
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']

after:
[[0 0 0]
 [1 0 0]
 [0 0 0]]
['B', 'D', 'G']
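Note that a single pass can leave new all-zero columns behind, as the 3x3 result above shows. The loop in the question restarts after every deletion, which suggests repeated pruning may be what is wanted; that is an assumption about intent. A rough sketch of that variant follows, where `prune_zero_columns` is a hypothetical helper name and the 4-node example matrix is made up:

```python
import numpy as np

def prune_zero_columns(graph, nodes):
    """Hypothetical helper: repeatedly drop nodes whose column sums to zero."""
    graph = np.asarray(graph)
    nodes = list(nodes)
    while True:
        keep = graph.sum(axis=0) != 0       # boolean mask over columns
        if keep.all():                      # no zero-sum columns left
            return graph, nodes
        graph = graph[np.ix_(keep, keep)]   # np.ix_ also accepts boolean masks
        nodes = [n for n, k in zip(nodes, keep) if k]

# Made-up example: D has a zero column, and C's only non-zero entry sits in D's row,
# so removing D turns C's column into a zero column as well.
graph = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
])
pruned, kept = prune_zero_columns(graph, ["A", "B", "C", "D"])
print(pruned)
print(kept)
```

Each pass is vectorized, and the loop runs at most once per removed batch of nodes, so it should still avoid the element-by-element restart of the original while loop.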