Question

I start off with an array a containing N unique values (product(a.shape) >= N).
I need to find the array b that holds, at the position of each element of a, the index 0 .. N-1 of that element in the (sorted) list of unique values of a.

As an example,

import numpy as np
np.random.seed(42)
a = np.random.choice([0.1,1.3,7,9.4], size=(4,3))
print(a)

prints a as

[[ 7.   9.4  0.1]
 [ 7.   7.   9.4]
 [ 0.1  0.1  7. ]
 [ 1.3  7.   7. ]]

The unique values are [0.1, 1.3, 7.0, 9.4], so the required result b would be

[[2 3 0]
 [2 2 3]
 [0 0 2]
 [1 2 2]]

(For example, the value at a[0,0] is 7.; 7. has the index 2; therefore b[0,0] == 2.)

Since numpy does not have an index function, I could do this with a loop, either looping over the input array, like this:

u = np.unique(a).tolist()
af = a.flatten()
b = np.empty(len(af), dtype=int)
for i in range(len(af)):
    b[i] = u.index(af[i])
b = b.reshape(a.shape)
print(b)

or looping over the unique values, as follows:

u = np.unique(a)
b = np.empty(a.shape, dtype=int)
for i in range(len(u)):
    b[np.where(a == u[i])] = i
print(b)

I suppose the second way, looping over the unique values, is already more efficient than the first in cases where not all values in a are distinct; still, it involves a Python loop and is rather inefficient compared to in-place operations.

So my question is: what is the most efficient way of obtaining the array b filled with the indices of the unique values of a?

Answer 1

You can use np.unique with its optional argument return_inverse:

np.unique(a, return_inverse=1)[1].reshape(a.shape)

Sample run:

In [308]: a
Out[308]: 
array([[ 7. ,  9.4,  0.1],
       [ 7. ,  7. ,  9.4],
       [ 0.1,  0.1,  7. ],
       [ 1.3,  7. ,  7. ]])

In [309]: np.unique(a, return_inverse=1)[1].reshape(a.shape)
Out[309]: 
array([[2, 3, 0],
       [2, 2, 3],
       [0, 0, 2],
       [1, 2, 2]])
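
To make the one-liner easy to check outside an IPython session, here is a minimal self-contained sketch. It hard-codes the array from the question instead of relying on the random seed, and compares the result with the loop over unique values from the question. The names u, inv and b_loop are only illustrative. As far as I know, the inverse may come back flat or already shaped like a depending on the NumPy version, so the reshape is kept; it is harmless either way.

```python
import numpy as np

# The example array from the question, written out explicitly.
a = np.array([[7.0, 9.4, 0.1],
              [7.0, 7.0, 9.4],
              [0.1, 0.1, 7.0],
              [1.3, 7.0, 7.0]])

# u holds the sorted unique values; inv holds, for every element of a,
# its position in u.
u, inv = np.unique(a, return_inverse=True)
b = inv.reshape(a.shape)

# Reference result: the loop over unique values from the question.
b_loop = np.empty(a.shape, dtype=int)
for i, value in enumerate(u):
    b_loop[a == value] = i

print(b)
print(np.array_equal(b, b_loop))  # both approaches should agree
print(np.array_equal(u[b], a))    # indexing u with b rebuilds a
```

The last check, u[b] reproducing a, is a convenient sanity test for any implementation of b.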

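Going through the source code of np.unique, which looks pretty efficient to me, and pruning out the parts that are unnecessary here, we end up with another solution, like so: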

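def unique_return_inverse(a):
    ar = a.flatten()
    # Indices that sort the flattened array, and the sorted values.
    perm = ar.argsort()
    aux = ar[perm]
    # True wherever a new unique value starts in the sorted array.
    flag = np.concatenate(([True], aux[1:] != aux[:-1]))
    # Running count of unique values seen so far, shifted to start at 0.
    iflag = np.cumsum(flag) - 1
    # Scatter the unique-value indices back to the original positions.
    inv_idx = np.empty(ar.shape, dtype=np.intp)
    inv_idx[perm] = iflag
    return inv_idx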
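
To see what each step of this function does, here is a small trace on a four-element array. This is only an illustrative sketch: the input values are made up, and kind="stable" is my addition so that the order of the two equal elements in perm is deterministic (the final result does not depend on it, since equal values receive the same index anyway).

```python
import numpy as np

ar = np.array([7, 9, 0, 7])

perm = ar.argsort(kind="stable")   # [2, 0, 3, 1]: positions in sorted order
aux = ar[perm]                     # [0, 7, 7, 9]: the sorted values

# Mark the first occurrence of each distinct value in the sorted array.
flag = np.concatenate(([True], aux[1:] != aux[:-1]))  # [T, T, F, T]

# Cumulative count of distinct values, shifted to be zero-based.
iflag = np.cumsum(flag) - 1        # [0, 1, 1, 2]

# Undo the sort: element perm[k] of the input gets index iflag[k].
inv_idx = np.empty(ar.shape, dtype=np.intp)
inv_idx[perm] = iflag              # [1, 2, 0, 1]

print(inv_idx)
```

The unique values here are [0, 7, 9], so 7 maps to 1, 9 to 2 and 0 to 0, which matches inv_idx.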

Timings:

In [444]: a= np.random.randint(0,1000,(1000,400))

In [445]: np.allclose( np.unique(a, return_inverse=1)[1],unique_return_inverse(a))
Out[445]: True

In [446]: %timeit np.unique(a, return_inverse=1)[1]
10 loops, best of 3: 30.4 ms per loop

In [447]: %timeit unique_return_inverse(a)
10 loops, best of 3: 29.5 ms per loop

Not a huge improvement over the built-in.
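
The numbers above come from one machine and one NumPy version, so it may be worth rerunning the comparison locally. Below is a rough sketch using the standard timeit module instead of IPython's %timeit; the seed, the repeat counts and the helper names t_builtin and t_custom are arbitrary choices of mine, and the function body is copied from above so that the script is self-contained.

```python
import timeit
import numpy as np

def unique_return_inverse(a):
    ar = a.flatten()
    perm = ar.argsort()
    aux = ar[perm]
    flag = np.concatenate(([True], aux[1:] != aux[:-1]))
    iflag = np.cumsum(flag) - 1
    inv_idx = np.empty(ar.shape, dtype=np.intp)
    inv_idx[perm] = iflag
    return inv_idx

# Same shape and value range as in the timing session; the seed is arbitrary.
rng = np.random.RandomState(0)
a = rng.randint(0, 1000, (1000, 400))

# Check that both approaches agree before timing them.
builtin = np.unique(a, return_inverse=True)[1].ravel()
custom = unique_return_inverse(a)
same = np.array_equal(builtin, custom)

# Best-of-3 average time per call, in seconds.
n = 5
t_builtin = min(timeit.repeat(
    lambda: np.unique(a, return_inverse=True)[1], number=n, repeat=3)) / n
t_custom = min(timeit.repeat(
    lambda: unique_return_inverse(a), number=n, repeat=3)) / n

print("results agree:", same)
print("np.unique:             %.1f ms" % (t_builtin * 1e3))
print("unique_return_inverse: %.1f ms" % (t_custom * 1e3))
```

Absolute timings will differ from those quoted above; the point is only whether the gap between the two stays small.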
