Question

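I'm learning about neural networks and implementing one in Python. I started by defining a softmax function, following the solution given in the question "Softmax function - python". Here is my code: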

def softmax(A):
    """
    Computes a softmax function. 
    Input: A (N, k) ndarray.
    Returns: (N, k) ndarray.
    """
    s = 0
    e = np.exp(A)
    s = e / np.sum(e, axis =0)
    return s

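I was given some test code to check whether the softmax function is correct. test_array is the test data, and test_output is the correct output of softmax(test_array). Here is the test code: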

# Test if your function works correctly.
test_array = np.array([[0.101,0.202,0.303],
                       [0.404,0.505,0.606]]) 
test_output = [[ 0.30028906,  0.33220277,  0.36750817],
               [ 0.30028906,  0.33220277,  0.36750817]]
print(np.allclose(softmax(test_array),test_output))

But with the softmax function I defined, evaluating softmax(test_array) on the test data gives a different result:

print (softmax(test_array))

[[ 0.42482427  0.42482427  0.42482427]
 [ 0.57517573  0.57517573  0.57517573]]

Can anyone point out what is wrong with the softmax function I defined?

Answer 1

Try this:

In [327]: def softmax(A):
     ...:     e = np.exp(A)
     ...:     return  e / e.sum(axis=1).reshape((-1,1))

In [328]: softmax(test_array)
Out[328]:
array([[ 0.30028906,  0.33220277,  0.36750817],
       [ 0.30028906,  0.33220277,  0.36750817]])

Or, better, this version, which prevents overflow when large values are exponentiated:

def softmax(A):
    e = np.exp(A - np.max(A, axis=1).reshape((-1, 1)))
    return  e / e.sum(axis=1).reshape((-1,1))
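To see why the shifted version matters, here is a small sketch that runs both definitions above on an input with large entries. The names softmax_naive, softmax_stable, and big are illustrative and not part of the original answer.

```python
import numpy as np

def softmax_naive(A):
    # Direct exponentiation: np.exp overflows to inf for large entries.
    e = np.exp(A)
    return e / e.sum(axis=1).reshape((-1, 1))

def softmax_stable(A):
    # Subtract each row's max first, so the largest exponent is exp(0) = 1.
    e = np.exp(A - np.max(A, axis=1).reshape((-1, 1)))
    return e / e.sum(axis=1).reshape((-1, 1))

# Hypothetical input chosen so that exp() overflows in float64.
big = np.array([[1000.0, 1001.0, 1002.0]])

# Silence the overflow/invalid warnings that the naive version triggers.
with np.errstate(over="ignore", invalid="ignore"):
    naive_result = softmax_naive(big)

stable_result = softmax_stable(big)

print(naive_result)   # contains nan, because inf / inf is undefined
print(stable_result)  # finite probabilities that sum to 1
```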

Answer 2

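The problem is in your sum. You are summing over axis 0, when you should leave axis 0 untouched.

To sum all the entries belonging to the same example, i.e. those in the same row, you have to use axis 1.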

def softmax(A):
    """
    Computes a softmax function. 
    Input: A (N, k) ndarray.
    Returns: (N, k) ndarray.
    """
    e = np.exp(A)
    return e / np.sum(e, axis=1, keepdims=True)

Use keepdims to preserve the number of dimensions, so that e can be divided by the sum.
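As a rough illustration of what keepdims changes, the sketch below compares the shapes of the row sums with and without it, using the test array from the question. The variable names are made up for this example.

```python
import numpy as np

e = np.exp(np.array([[0.101, 0.202, 0.303],
                     [0.404, 0.505, 0.606]]))

row_sums_flat = np.sum(e, axis=1)                 # shape (2,)
row_sums_kept = np.sum(e, axis=1, keepdims=True)  # shape (2, 1)

print(row_sums_flat.shape, row_sums_kept.shape)

# (2, 3) / (2, 1) broadcasts each row sum across its own row.
result = e / row_sums_kept

# (2, 3) / (2,) cannot be broadcast: the trailing dimensions 3 and 2 differ.
try:
    e / row_sums_flat
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

print(broadcast_failed)
```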

In your example, e evaluates to:

[[ 1.10627664  1.22384801  1.35391446]
 [ 1.49780395  1.65698552  1.83308438]]

Then the sum for each example (the denominator in the return line) is:

[[ 3.68403911]
 [ 4.98787384]]

The function then divides each row by its sum, which gives the result in test_output.

As MaxU pointed out, it is good practice to subtract the max of each row before exponentiating, to avoid overflow:

e = np.exp(A - np.max(A, axis=1, keepdims=True))
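Subtracting each row's maximum does not change the output, because the common factor exp(-max) cancels between the numerator and the denominator. A quick check on the test array from the question, with illustrative helper names:

```python
import numpy as np

def softmax_plain(A):
    # Row-wise softmax without any shift.
    e = np.exp(A)
    return e / np.sum(e, axis=1, keepdims=True)

def softmax_shifted(A):
    # Row-wise softmax after subtracting each row's maximum.
    e = np.exp(A - np.max(A, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

test_array = np.array([[0.101, 0.202, 0.303],
                       [0.404, 0.505, 0.606]])

same = np.allclose(softmax_plain(test_array), softmax_shifted(test_array))
print(same)
```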

Answer 3

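You can print np.sum(e, axis=0) yourself. You will see that it is an array with 3 elements, [ 2.60408059 2.88083353 3.18699884]. So e / np.sum(e, axis=0) divides each row of e (also a 3-element array) elementwise by that 3-element array. Clearly this is not what you want.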

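You should change np.sum(e, axis=0) to np.sum(e, axis=1, keepdims=True), so that the denominator becomes

[[ 3.68403911]
 [ 4.98787384]]

instead, which is what you actually want. You will then get the correct result.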

I suggest you read the rules of broadcasting in NumPy. They describe how addition, subtraction, multiplication, and division work on two arrays of different shapes.
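A minimal sketch of the two broadcasting cases at play here, using a made-up 2x3 array rather than the data from the question:

```python
import numpy as np

e = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

col_sums = np.sum(e, axis=0)                 # shape (3,):   [5., 7., 9.]
row_sums = np.sum(e, axis=1, keepdims=True)  # shape (2, 1): [[6.], [15.]]

# (2, 3) / (3,): the 1-D array is lined up with the last axis,
# so each column is divided by its own sum.
by_column = e / col_sums

# (2, 3) / (2, 1): the size-1 axis is stretched across the columns,
# so each row is divided by its own sum.
by_row = e / row_sums

print(by_column.sum(axis=0))  # every column sums to 1
print(by_row.sum(axis=1))     # every row sums to 1
```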

Answer 4

Perhaps this is enlightening:

>>> np.sum(test_output, axis=1)
array([ 1.,  1.])

Note that each row is normalized. In other words, they want you to compute the softmax of each row independently.
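The same check can be run on the function from the question to see which axis it actually normalizes. This is only a diagnostic sketch; softmax_axis0 is an illustrative name for the original definition.

```python
import numpy as np

test_array = np.array([[0.101, 0.202, 0.303],
                       [0.404, 0.505, 0.606]])

def softmax_axis0(A):
    # The definition from the question: the sum runs down each column.
    e = np.exp(A)
    return e / np.sum(e, axis=0)

out = softmax_axis0(test_array)

col_totals = np.sum(out, axis=0)  # columns are normalized
row_totals = np.sum(out, axis=1)  # rows are not

print(col_totals)
print(row_totals)
```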
