Question

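I want to compute the output error of a neural network for each input by comparing its output signal with the true output value, so I need two matrices for this calculation.

I have an output matrix of shape (n * 1), but the labels only give me the index of the neuron that should be activated. So I need a matrix of the same shape in which every element is zero except the one whose index equals the label. I could write a function for this, but I would like to know whether numpy has a built-in method that does it for me.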

Answer 1

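You can do this in a number of ways using numpy or the standard library. One way is to create an array of zeros and set the value at the label index to 1.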

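import numpy as np

n = len(result)     # result is the network output

a = np.zeros((n,))  # array of n zeros
a[id] = 1           # id is the label index (note that it shadows the builtin id)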

It is probably also the fastest one:

>> %timeit a = np.zeros((n,)); a[id] = 1
1000000 loops, best of 3: 634 ns per loop
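
As a minimal sketch, the zeros-and-assign approach can be wrapped in a small helper that returns a column of the same (n, 1) shape as the network output described in the question. The name one_hot_column and the sample values are hypothetical, chosen only for illustration.

```python
import numpy as np

def one_hot_column(n, index):
    """Return an (n, 1) array of zeros with a single 1 at row `index`."""
    target = np.zeros((n, 1))
    target[index, 0] = 1
    return target

# Hypothetical example: 5 output neurons, the label says neuron 2 should fire.
target = one_hot_column(5, 2)
output = np.array([[0.1], [0.2], [0.6], [0.05], [0.05]])  # made-up network output
error = output - target  # element-wise error, same (5, 1) shape
```

Because both arrays share the same shape, the subtraction is element-wise and no broadcasting surprises occur.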

Alternatively, you can use numpy.pad to pad a [1] array with zeros. But this will almost certainly be slower because of the padding logic.

# id zeros before the 1 and n - id - 1 after it, so the result has length n
np.pad([1], (id, n - id - 1), 'constant', constant_values=0)

As expected, it is orders of magnitude slower:

>> %timeit np.lib.pad([1],(id,n-id),'constant', constant_values=(0))
10000 loops, best of 3: 47.4 µs per loop

You can also try a list comprehension, as suggested in the comments:

results = [7]

np.matrix([1 if x == id else 0 for x in results])

But it is much slower than the first method:

>> %timeit np.matrix([1 if x == id else 0 for x in results])
100000 loops, best of 3: 7.25 µs per loop
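
Note that the comprehension above compares the elements of results with id. A sketch of a variant that compares positions instead, assuming n is the number of output neurons; label_index is a hypothetical name used here to avoid shadowing the builtin id:

```python
import numpy as np

n = 10           # assumed number of output neurons
label_index = 7  # assumed label

# 1 at the position equal to the label, 0 everywhere else
one_hot = np.array([1 if i == label_index else 0 for i in range(n)])
```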

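Edit: In my opinion, though, if you want to calculate the error of a neural network, you should just use np.argmax and check whether the prediction was correct. The error calculation described above may give you more noise than useful information. If you feel your network is prone to confusing similar classes, you can build a confusion matrix.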
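
A rough sketch of that suggestion, using made-up outputs and labels for a 3-class problem. The variable names are illustrative, and the confusion matrix is built by hand with np.add.at rather than with any particular library helper.

```python
import numpy as np

# Made-up data: each row is the network output for one sample (3 classes).
outputs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
])
labels = np.array([0, 2, 2, 1])  # index of the neuron that should fire

predictions = np.argmax(outputs, axis=1)   # most active neuron per sample
accuracy = np.mean(predictions == labels)  # fraction of correct predictions

# confusion[i, j] counts samples with true label i that were predicted as class j.
n_classes = outputs.shape[1]
confusion = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(confusion, (labels, predictions), 1)
```

Off-diagonal entries of confusion show which classes the network mixes up.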

Answer 2

Some other methods also appear to be slower than @umutto's above:

%timeit a = np.zeros((n,)); a[id] = 1 #umutto's method
The slowest run took 45.34 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.53 µs per loop

Boolean construction:

%timeit a = np.arange(n) == id
The slowest run took 13.98 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.76 µs per loop

Boolean construction converted to integers:

%timeit a = (np.arange(n) == id).astype(int)
The slowest run took 15.31 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.47 µs per loop

List construction:

%timeit a = [0]*n; a[id] = 1; a=np.asarray(a)
10000 loops, best of 3: 77.3 µs per loop

Using scipy.sparse:

%timeit a = sparse.coo_matrix(([1], ([id],[0])), shape=(n,1))
10000 loops, best of 3: 51.1 µs per loop

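Which one is actually faster may depend on what is cached, but constructing an array of zeros seems to be the fastest, especially if you can use np.zeros_like(result) instead of np.zeros(len(result)).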
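
A short sketch of the np.zeros_like variant, assuming result is the (n, 1) network output from the question. The values and the name label are made up for illustration.

```python
import numpy as np

result = np.array([[0.1], [0.7], [0.2]])  # made-up (n, 1) network output
label = 1                                 # assumed label index

target = np.zeros_like(result)  # same shape and dtype as result
target[label] = 1               # row `label` holds a single entry, which becomes 1
```

np.zeros_like copies the shape and dtype from result, so the target matches the output without computing n separately.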

Answer 3

A one-liner:

x = np.identity(n)[id]
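
Indexing an identity matrix also works with an array of labels, which yields one row per label. A small sketch with made-up labels; note that np.eye(n) allocates a full n x n array first, so this is convenient rather than memory-efficient for large n.

```python
import numpy as np

n = 4
labels = np.array([2, 0, 3])  # made-up labels for three samples

one_hot = np.eye(n)[labels]   # shape (3, 4), one row per label
```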

Efficient way to construct a matrix with all elements zero except one in numpy
