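Easy way to delete columns that sum to zero from a numpy array

Time: 2017-04-27 03:56:35

Tags: python arrays numpy

Is there an easy way to delete the columns of a numpy matrix that sum to zero, along with their corresponding rows?

I am trying to build the transition matrix for PageRank, but the code I wrote does not seem very efficient.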

i = 1
while True:
    if len(graph) == i-1:
        break
    else:
        col_sum = np.sum(graph[:,i-1])
        if col_sum == 0:
            graph = np.delete(graph, np.s_[i-1], 1)
            graph = np.delete(graph, i-1, 0)
            nodes.remove(nodes[i-1])
            i = 0
        i += 1

2 answers:

Answer 0 (score: 1)

Here is a vectorized version using np.ix_:

# axis=0 sums each column, matching the column test in the question
mask = np.nonzero(np.sum(graph, axis=0))[0]
graph = graph[np.ix_(mask, mask)]
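To see why np.ix_ is needed here rather than plain index arrays, the following is a minimal sketch on a made-up 4-node matrix (the values are an assumption chosen for illustration, not data from the question). It compares the open-mesh selection with element-wise fancy indexing.

```python
import numpy as np

# Hypothetical 4x4 adjacency matrix; columns 1 and 3 sum to zero.
graph = np.array([
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
])

# Integer positions of the columns whose sum is non-zero: [0, 2]
keep = np.flatnonzero(graph.sum(axis=0))

# np.ix_ builds an open mesh, so rows AND columns in `keep` are selected.
sub = graph[np.ix_(keep, keep)]
print(sub)   # [[0 1]
             #  [1 0]]

# Plain fancy indexing pairs the indices element-wise instead,
# returning only graph[0, 0] and graph[2, 2].
diag = graph[keep, keep]
print(diag)  # [0 0]
```

The same submatrix can also be obtained with two chained selections, graph[:, keep][keep, :], at the cost of an intermediate copy.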

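Answer 1 (score: 0)

Numpy is designed to do this kind of thing without loops. Most functions, such as np.sum, work on matrices or multidimensional arrays and take an axis argument that tells them which dimension to operate along. We can then use index arrays or boolean mask arrays to select elements from the array.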

import numpy as np
np.random.seed(42)

nodes = [chr(i + 65) for i in range(10)]

a = (np.random.randn(10, 10) > 1.5).astype(int)
print('before:')
print(a)
print(nodes)

col_sum = np.sum(a, axis=0)  # sum of each column
idx = np.flatnonzero(col_sum)  # indices of non-zero columns

# remove columns and rows
a = a[:, idx][idx, :]  # note that a[idx, idx] won't work

# if nodes was an array we could do this:
#nodes = nodes[idx]

# but nodes is a list, so we need a list comprehension:
nodes = [nodes[i] for i in idx]  # idx holds integer positions, not a boolean mask

print('\nafter:')
print(a)
print(nodes)

Result:

before:
[[0 0 0 1 0 0 1 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 1 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 1 0 1 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]]
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']

after:
[[0 0 0]
 [1 0 0]
 [0 0 0]]
['B', 'D', 'G']
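Note that the "after" matrix above still contains columns that sum to zero: removing rows can empty columns that were non-zero before. The loop in the question restarts after every deletion, so it keeps going until no such column is left. If that behaviour is what you want, one option is to repeat the vectorized step until nothing changes. The following is only a sketch under that assumption; prune_zero_columns is a hypothetical helper name, not a numpy function.

```python
import numpy as np

def prune_zero_columns(a, nodes):
    """Repeatedly drop zero-sum columns and their matching rows.

    Hypothetical helper: stops once every remaining column has a
    non-zero sum, or once the matrix is empty.
    """
    a = np.asarray(a)
    nodes = list(nodes)
    while a.size:
        keep = np.flatnonzero(a.sum(axis=0))  # columns with non-zero sum
        if len(keep) == a.shape[1]:
            break  # nothing left to remove
        a = a[np.ix_(keep, keep)]             # keep matching rows and columns
        nodes = [nodes[i] for i in keep]      # keep the labels in step
    return a, nodes

# Made-up example: removing A empties B's column, so B goes in a second pass.
example = np.array([
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
])
pruned, names = prune_zero_columns(example, ['A', 'B', 'C'])
print(pruned)  # [[1]]
print(names)   # ['C']
```

Each pass is vectorized, and the number of passes is at most the number of nodes, so this avoids restarting a Python-level scan over every column after each single deletion.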