Question

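I'm trying to create a network in TensorFlow with multiple softmax outputs, each of a different size. The network architecture is: Input -> LSTM -> Dropout. After that I have 2 softmax layers: a softmax of 10 outputs and a softmax of 20 outputs. The reason is that I want to generate two sets of outputs (10 and 20) and then combine them to produce a final output. I'm not sure how to do this in TensorFlow.

Previously, to build a network like the one described but with a single softmax, I think I could do something like this.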

inputs = tf.placeholder(tf.float32, [batch_size, maxlength, vocabsize])
lengths = tf.placeholder(tf.int32, [batch_size])
embeddings = tf.Variable(tf.random_uniform([vocabsize, 256], -1, 1))
lstm = {}
lstm[0] = tf.contrib.rnn.LSTMCell(hidden_layer_size, state_is_tuple=True, initializer=tf.contrib.layers.xavier_initializer(seed=random_seed))
lstm[0] = tf.contrib.rnn.DropoutWrapper(lstm[0], output_keep_prob=0.5)
lstm[0] = tf.contrib.rnn.MultiRNNCell(cells=[lstm[0]] * 1, state_is_tuple=True)
output_layer = {}
output_layer[0] = Layer.W(1 * hidden_layer_size, 20, 'OutputLayer')
output_bias = {}
output_bias[0] = Layer.b(20, 'OutputBias')
outputs = {}
fstate = {}
with tf.variable_scope("lstm0"):
    # create the rnn graph at run time
    outputs[0], fstate[0] = tf.nn.dynamic_rnn(lstm[0], tf.nn.embedding_lookup(embeddings, inputs),
                                              sequence_length=lengths,
                                              dtype=tf.float32)
logits = {}
logits[0] = tf.matmul(tf.concat([f.h for f in fstate[0]], 1), output_layer[0]) + output_bias[0]
loss = {}
loss[0] = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits[0], labels=labels[0]))

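Now, however, I want my RNN output (after dropout) to flow into 2 softmax layers, one of size 10 and another of size 20. Does anyone know how to do this?

Thanks

Edit: Ideally, I'd like to use a version of softmax such as the one defined in this Knet Julia library. Does TensorFlow have an equivalent? https://github.com/denizyuret/Knet.jl/blob/1ef934cc58f9671f2d85063f88a3d6959a49d088/deprecated/src7/op/actf.jl#L103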

Answer 1

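Your code doesn't define the logits for the size-10 softmax layer; you have to do that explicitly.

Once that's done, you can use tf.nn.softmax, applying it separately to each of your two logit tensors.

For example, for your 20-class softmax tensor: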

softmax20 = tf.nn.softmax(logits[0])

For the other layer, you could do:

output_layer[1] = Layer.W(1 * hidden_layer_size, 10, 'OutputLayer10')
output_bias[1] = Layer.b(10, 'OutputBias10')

logits[1] = tf.matmul(tf.concat([f.h for f in fstate[0]], 1),
                      output_layer[1]) + output_bias[1]

softmax10 = tf.nn.softmax(logits[1])

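There is also tf.contrib.layers.softmax, which lets you apply the softmax on the final axis of a tensor with more than 2 dimensions, but it doesn't look like you need anything like that. tf.nn.softmax should work here.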
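To make concrete what "applying softmax separately to two logit tensors" means, here is a minimal sketch in plain Python (no TensorFlow). The feature vector, the weight values, and the helper names `softmax` and `linear` are all made up for illustration; only the idea of two independent heads on one shared RNN output comes from the answer.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(features, weights, bias):
    """Compute features @ weights + bias for a single example (plain lists)."""
    return [
        sum(f * w_row[j] for f, w_row in zip(features, weights)) + bias[j]
        for j in range(len(bias))
    ]

# Hypothetical shared feature vector standing in for the LSTM's final state.
hidden = [0.5, -1.0, 2.0]

# Hypothetical parameters: one [hidden_size, 20] head and one [hidden_size, 10] head.
w20 = [[0.01 * (i + j) for j in range(20)] for i in range(3)]
b20 = [0.0] * 20
w10 = [[0.02 * (i - j) for j in range(10)] for i in range(3)]
b10 = [0.0] * 10

# Each head gets its own logits and its own, independent softmax.
softmax20 = softmax(linear(hidden, w20, b20))
softmax10 = softmax(linear(hidden, w10, b10))

print(len(softmax20), sum(softmax20))
print(len(softmax10), sum(softmax10))
```

Both heads read the same features but normalize independently, so each result is its own probability distribution, one over 20 classes and one over 10.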

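Side note: output_layer is not the greatest name for that list; it should be something involving weights. These weights and biases (output_layer, output_bias) also do not represent the output layer of your network (since that will come from whatever you do with your softmax outputs, right?). [Sorry, couldn't help myself.]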

Answer 2

You can do the following on the output of your dynamic_rnn, which you called output[0], to compute the two softmaxes and the corresponding losses:

with tf.variable_scope("softmax_0"):
    # Transform your RNN output to the right output size = 10
    # (output[0] is assumed to be 2-D here: [batch_size, hidden_size])
    W = tf.get_variable("kernel_0", [output[0].get_shape()[1], 10])
    logits_0 = tf.matmul(output[0], W)
    # Apply the softmax function to the logits (of size 10)
    output_0 = tf.nn.softmax(logits_0, name="softmax_0")
    # Compute the loss (as you did in your question) with softmax_cross_entropy_with_logits applied directly to the logits
    loss_0 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits_0, labels=labels[0]))

with tf.variable_scope("softmax_1"):
    # Transform your RNN output to the right output size = 20
    W = tf.get_variable("kernel_1", [output[0].get_shape()[1], 20])
    logits_1 = tf.matmul(output[0], W)
    # Apply the softmax function to the logits (of size 20)
    output_1 = tf.nn.softmax(logits_1, name="softmax_1")
    # Compute the loss (as you did in your question) with softmax_cross_entropy_with_logits applied directly to the logits
    loss_1 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits_1, labels=labels[1]))

If it is relevant to your application, you can then combine the two losses:

total_loss = loss_0 + loss_1

EDIT: To answer the question in your comment about what specifically to do with the two softmax outputs, you can do roughly the following:

with tf.variable_scope("second_part"):
    W1 = tf.get_variable("W_1", [output_0.get_shape()[1], n])
    W2 = tf.get_variable("W_2", [output_1.get_shape()[1], n])
    prediction = tf.matmul(output_0, W1) + tf.matmul(output_1, W2)
with tf.variable_scope("optimization_part"):
    loss = tf.reduce_mean(tf.squared_difference(prediction, label))

You only need to define n, the number of columns of W1 and W2.
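The arithmetic behind summing the two losses and mixing the two softmax outputs through W1 and W2 can be checked by hand with a small plain-Python sketch. Everything concrete in it is an assumption made for illustration: the all-zero logits, the one-hot labels, n = 4, the constant weight values, and the helper names `cross_entropy_with_logits` and `vec_mat`. It handles a single example rather than a batch.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_with_logits(logits, one_hot):
    """Cross-entropy between one-hot labels and softmax(logits), via log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(y * (x - log_z) for x, y in zip(logits, one_hot))

def vec_mat(vec, mat):
    """Row vector times matrix: [k] x [k, n] -> [n]."""
    n_cols = len(mat[0])
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(n_cols)]

# Hypothetical logits for one example: all zeros, so both softmaxes are uniform.
logits_0 = [0.0] * 10
logits_1 = [0.0] * 20
label_0 = [1.0] + [0.0] * 9    # hypothetical one-hot target for the 10-way head
label_1 = [1.0] + [0.0] * 19   # hypothetical one-hot target for the 20-way head

loss_0 = cross_entropy_with_logits(logits_0, label_0)  # log(10) for uniform logits
loss_1 = cross_entropy_with_logits(logits_1, label_1)  # log(20) for uniform logits
total_loss = loss_0 + loss_1

# Mixing step: n is the shared number of columns of W1 and W2 (hypothetical value).
n = 4
W1 = [[1.0] * n for _ in range(10)]   # shape [10, n]
W2 = [[0.5] * n for _ in range(20)]   # shape [20, n]
output_0 = softmax(logits_0)
output_1 = softmax(logits_1)
prediction = [a + b for a, b in zip(vec_mat(output_0, W1), vec_mat(output_1, W2))]

print(total_loss, prediction)
```

Because W1 is [10, n] and W2 is [20, n], both products have length n and can be added elementwise, which is why n is the only free choice in the mixing step.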

How to have multiple softmax outputs in TensorFlow?
