Question

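My network's penultimate layer has shape (U, C), where C is the number of channels. I'd like to apply the softmax function to each channel separately.

For example, if U=2 and C=3, and the layer produces [ [1 2 3], [10 20 30] ], I'd like the output to be softmax(1, 2, 3) for channel 0 and softmax(10, 20, 30) for channel 1.

Is there a way to do this with Keras? I'm using TensorFlow as the backend.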

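Update

Please also explain how to ensure that the loss is the sum of the two cross-entropies, and how I can verify this. (That is, I don't want the optimizer to train on the loss of only one of the softmaxes, but on the sum of the cross-entropy losses of both.) The model uses Keras's built-in cross-entropy loss.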

Answer 1

Use the functional API with multiple outputs: https://keras.io/getting-started/functional-api-guide/

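from keras.layers import Input, Lambda, Softmax
from keras.models import Model

inputs = Input(...)
...
t = some_tensor  # output of the penultimate layer, shape (batch, U, C)
# Slice out each channel inside a Lambda layer so the result stays a Keras tensor
t0 = Lambda(lambda x: x[:, :, 0])(t)
t1 = Lambda(lambda x: x[:, :, 1])(t)
soft0 = Softmax()(t0)  # softmax over the last axis by default
soft1 = Softmax()(t1)
outputs = [soft0, soft1]
model = Model(inputs=inputs, outputs=outputs)
model.compile(...)
# One target array per output, in the same order as `outputs`
model.fit(x_train, [y_train0, y_train1], epochs=10, batch_size=32)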
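
On the question of the summed loss: as far as I know, when a Keras model has several outputs, the quantity it minimizes is the weighted sum of the per-output losses, with the weights set through the loss_weights argument of compile (each weight is 1 if you don't pass it). To make that concrete, here is a rough NumPy sketch of what the total would be for two softmax outputs. The tensor values, the one-hot targets, and the two helper functions are made up for illustration; this is not Keras code, only the arithmetic you could compare the reported loss against.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max along the axis for numerical stability
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Batch mean of -sum(y_true * log(y_pred)) over the class axis
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

# Hypothetical penultimate output: batch of 1, U=3 positions, C=2 channels
t = np.array([[[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]])
soft0 = softmax(t[:, :, 0])  # softmax over (1, 2, 3)
soft1 = softmax(t[:, :, 1])  # softmax over (10, 20, 30)

# Hypothetical one-hot targets, one array per output
y_train0 = np.array([[0.0, 0.0, 1.0]])
y_train1 = np.array([[0.0, 1.0, 0.0]])

loss0 = categorical_crossentropy(y_train0, soft0)
loss1 = categorical_crossentropy(y_train1, soft1)
total_loss = loss0 + loss1  # sum of both cross-entropies (unit loss weights)
print(loss0, loss1, total_loss)
```

If the loss your model reports on the same batch matches loss0 + loss1 computed this way, both softmaxes are contributing to training.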

Answer 2

Define a Lambda layer and use the backend's softmax function with the desired axis to compute the softmax over that axis:

from keras import backend as K
from keras.layers import Lambda

soft_out = Lambda(lambda x: K.softmax(x, axis=my_desired_axis))(input_tensor)

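Update: A numpy array with N dimensions has a shape of (d1, d2, d3, ..., dn). Each of these is called an axis. So the first axis (i.e. axis=0) has size d1, the second axis (i.e. axis=1) has size d2, and so on. The most common case is a 2D array, i.e. one of shape (m, n), with m rows (axis=0) and n columns (axis=1). When we specify the axis for an operation, the operation is computed over that axis. Let me make this clearer with examples: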

>>> import numpy as np
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

>>> a.shape
(3, 4)   # three rows and four columns

>>> np.sum(a, axis=0)  # compute the sum over the rows (i.e. for each column)
array([12, 15, 18, 21])

>>> np.sum(a, axis=1)  # compute the sum over the columns (i.e. for each row)
array([ 6, 22, 38])

>>> np.sum(a, axis=-1) # axis=-1 is equivalent to the last axis (i.e. columns)
array([ 6, 22, 38])

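Now, the same applies to computing the softmax in your example. You must first decide over which axis you want the softmax computed, and then specify it with the axis argument. Also note that softmax is applied over the last axis (i.e. axis=-1) by default, so if you want it computed over the last axis you don't need the Lambda layer above. Just use an Activation layer: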

from keras.layers import Activation

soft_out = Activation('softmax')(input_tensor)
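
To see what the choice of axis does to the array from the question, here is a small NumPy sketch. The softmax helper is written by hand for illustration and is not the Keras function, but it normalizes over the given axis in the same way.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max along the axis for numerical stability
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

x = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])

last = softmax(x, axis=-1)  # normalizes within each row: (1, 2, 3) and (10, 20, 30)
first = softmax(x, axis=0)  # normalizes within each column: (1, 10), (2, 20), (3, 30)

print(last.sum(axis=-1))  # every row sums to 1
print(first.sum(axis=0))  # every column sums to 1
```

With axis=-1 the two rows are normalized independently, which is the softmax(1, 2, 3) and softmax(10, 20, 30) behaviour asked for in the question.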

Update 2: There is also another way of doing this, using the Softmax layer:

from keras.layers import Softmax

soft_out = Softmax(axis=desired_axis)(input_tensor)

Softmax over two channels in TensorFlow and Keras
