Question

In my program, I am trying to get the top 10 items of an array together with their indices. The array is of type ndarray.

for a in arr:
   print(a)

(0, 112354) 0.11235445
(0, 875) 0.155235445
(0, 6135) -0.14445445
...

I tried passing the array to numpy.sort, but it did not give the result I need.

How can I get the top 10 array items along with their indices?

Update

pprint(arr) outputs:

<1x28382 sparse matrix of type '<class 'numpy.float64'>'
    with 18404 stored elements in Compressed Sparse Row format>

print(arr) returns:

  (0, 11098)    0.113315317878
  (0, 6775)     0.0513432082411
  (0, 5107)     0.0544519626112
  (0, 98)       0.059766413309
  (0, 27042)    0.104718642966
  (0, 22622)    0.104718642966
  (0, 6135)     0.104718642966

arr is actually sklearn.svm.SVC.coef_.

Thanks for your help.

Answer 1

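Since this is a sparse matrix, it is more efficient to work on a.data, the array that holds only the stored (non-zero) values.

A simple example: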
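For reference, a CSR matrix keeps its stored values in .data and the matching column indices in .indices, so sorting .data touches only the stored entries. A minimal sketch with a hand-made matrix (the values are made up purely for illustration):

```python
import numpy as np
from scipy import sparse

# Hand-made 2x4 example; the values are arbitrary.
dense = np.array([[0.0, 0.3, 0.0, 0.1],
                  [0.2, 0.0, 0.0, 0.0]])
m = sparse.csr_matrix(dense)

print(m.data)     # stored values, row by row
print(m.indices)  # column index of each stored value
print(m.indptr)   # row i occupies data[indptr[i]:indptr[i+1]]
```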

import numpy as np
from scipy import sparse

a = np.zeros(12, int)
a[:6] = range(6)        # values 0..5, the rest stay zero
np.random.shuffle(a)    # shuffle lives in numpy.random
a = sparse.csr_matrix(a.reshape(4, 3))
print(a.toarray())
print(a)

# a as a dense array (for one particular shuffle)
[[0 5 3]
 [0 0 2]
 [0 0 0]
 [1 4 0]]

# or in csr format
(0, 1)  5.0
(0, 2)  3.0
(1, 2)  2.0
(3, 0)  1.0
(3, 1)  4.0

Then find the n largest values, together with their row and column indices:

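n = 3  # how many of the largest values to find
# positions in a.data of the n largest stored values, largest first
bigs = a.data.argsort()[:-n-1:-1]
# row/column of every stored value, in the same order as a.data
# (assumes no explicitly stored zeros)
r, c = a.nonzero()
R, C = r[bigs], c[bigs]
print("the 3 biggest are in ", *zip(R.tolist(), C.tolist()))

This gives: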

the 3 biggest are in  (0, 1) (3, 1) (0, 2)
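Applied to the 1xN case from the question, the same idea might look like the sketch below. The helper name top_n_sparse_row and the sample values are hypothetical, not part of the original answer, and np.argpartition is swapped in for a full argsort so that only the n selected entries are sorted:

```python
import numpy as np
from scipy import sparse

def top_n_sparse_row(row, n):
    """Return (columns, values) of the n largest stored entries of a 1xN sparse row.

    Hypothetical helper for illustration; ranks by signed value.
    """
    row = row.tocsr()
    n = min(n, row.nnz)          # cannot return more than what is stored
    if n == 0:
        return row.indices[:0], row.data[:0]
    # Positions of the n largest stored values, in no particular order.
    part = np.argpartition(row.data, -n)[-n:]
    # Sort just those n positions, largest value first.
    order = part[np.argsort(row.data[part])[::-1]]
    # For a single-row CSR matrix, indices[i] is the column of data[i].
    return row.indices[order], row.data[order]

# Made-up 1x10 row with four stored values.
vals = np.array([0.11, 0.05, -0.14, 0.16])
cols = np.array([3, 7, 1, 9])
arr = sparse.csr_matrix((vals, (np.zeros(4, dtype=int), cols)), shape=(1, 10))

top_cols, top_vals = top_n_sparse_row(arr, 2)
for col, val in zip(top_cols.tolist(), top_vals.tolist()):
    print((0, col), val)
```

To rank coefficients by magnitude instead of signed value, one option is to partition and sort on np.abs(row.data) while still returning row.data[order].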
