Question

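I am building a seq2seq model with LSTMs in TensorFlow. The loss function I use is softmax cross-entropy. The problem is that my input sequences have different lengths, so I pad them. The output of the model has shape [max_length, batch_size, vocab_size]. How can I compute the loss so that the 0-padded values do not affect it? tf.nn.softmax_cross_entropy_with_logits provides an axis argument, so the loss can be computed on 3-D input, but it does not accept weights. tf.losses.softmax_cross_entropy provides a weights argument, but it expects input of shape [batch_size, nclass(vocab_size)]. Please help!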

Answer 1

I think you will have to write your own loss function. Take a look at https://danijar.com/variable-sequence-lengths-in-tensorflow/.
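As a rough idea of what such a custom loss could look like, here is a minimal sketch. It assumes TensorFlow 2.x with eager execution, integer class-id labels of shape [max_length, batch_size], and a vector holding the true length of each sequence. The names masked_sequence_loss and lengths are made up for illustration and do not come from the linked article.

```python
import tensorflow as tf

def masked_sequence_loss(logits, labels, lengths):
    """Mean cross-entropy over real (non-padded) time steps.

    logits:  float tensor, [max_length, batch_size, vocab_size] (time-major)
    labels:  int tensor,   [max_length, batch_size], class ids
    lengths: int tensor,   [batch_size], true length of each sequence (assumed input)
    """
    max_length = tf.shape(logits)[0]
    # Per-step loss, shape [max_length, batch_size].
    step_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)
    # sequence_mask returns [batch_size, max_length]; transpose it to time-major.
    mask = tf.transpose(
        tf.sequence_mask(lengths, maxlen=max_length, dtype=step_loss.dtype))
    # Padded steps are multiplied by 0, and the mean is taken over real steps only.
    return tf.reduce_sum(step_loss * mask) / tf.reduce_sum(mask)
```

Dividing by the sum of the mask rather than by max_length * batch_size keeps heavily padded batches from producing an artificially small loss.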

Answer 2

In this case you need to pad both the logits and the labels so that they have the same length. So if the tensor logits has size (batch_size, length, vocab_size) and the tensor labels has size (batch_size, length), where length is the size of your sequence, you first have to pad them to the same length along the sequence dimension.

Then you can compute the padded cross-entropy, in which the padded positions do not count toward the loss.
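A minimal sketch of these two steps might look like the following. It assumes TensorFlow 2.x, batch-major tensors, integer class-id labels, and that id 0 marks padding. The function names pad_to_same_length and padded_cross_entropy and the pad_id parameter are chosen here for illustration only.

```python
import tensorflow as tf

def pad_to_same_length(logits, labels, pad_id=0):
    """Pad logits and labels along the sequence axis to a common length."""
    logits_length = tf.shape(logits)[1]
    labels_length = tf.shape(labels)[1]
    max_length = tf.maximum(logits_length, labels_length)
    # Logits are padded with zeros, labels with the padding id.
    logits = tf.pad(logits, [[0, 0], [0, max_length - logits_length], [0, 0]])
    labels = tf.pad(labels, [[0, 0], [0, max_length - labels_length]],
                    constant_values=pad_id)
    return logits, labels

def padded_cross_entropy(logits, labels, pad_id=0):
    """Mean cross-entropy that ignores positions whose label equals pad_id."""
    logits, labels = pad_to_same_length(logits, labels, pad_id)
    # Per-position loss, shape [batch_size, length].
    step_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)
    # Weight 1 for real tokens, 0 for padding.
    weights = tf.cast(tf.not_equal(labels, pad_id), step_loss.dtype)
    # Guard against division by zero when a batch holds only padding.
    return tf.reduce_sum(step_loss * weights) / tf.maximum(
        tf.reduce_sum(weights), 1.0)
```

Note that this treats every label equal to pad_id as padding, so the padding id must not double as a real vocabulary entry.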

Answer 3

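The function below takes two tensors of shape (batch_size, time_steps, vocab_len). It computes a mask that zeroes out the time steps that correspond to padding. The mask removes the padding's contribution from the categorical cross-entropy.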

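import numpy as np
import tensorflow as tf

K = tf.keras.backend

# The padding label is assumed to be the one-hot vector with 1 as its first element.
# `vocab_len` must be defined before this function is called.
def mask_loss(y_true, y_pred):
    mask_value = np.zeros((vocab_len,))
    mask_value[0] = 1
    # Find out which time steps in `y_true` are not the padding label.
    mask = K.equal(y_true, mask_value)
    mask = 1 - K.cast(mask, K.floatx())
    # A real one-hot label differs from `mask_value` in exactly two entries,
    # so sum / 2 is 1 for real time steps and 0 for padding.
    mask = K.sum(mask, axis=2) / 2
    # Multiply the loss by the mask; the loss for padding becomes zero.
    loss = K.categorical_crossentropy(y_true, y_pred) * mask
    return K.sum(loss) / K.sum(mask)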
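To see why dividing by 2 gives a 0/1 mask, here is a small NumPy check of the same arithmetic. It assumes one-hot labels with class 0 as padding, and the toy sequence is made up for illustration.

```python
import numpy as np

vocab_len = 4
mask_value = np.zeros(vocab_len)
mask_value[0] = 1  # one-hot vector of the padding class (index 0)

# One sequence with three time steps: classes 2, 1, then padding (class 0).
y_true = np.eye(vocab_len)[[2, 1, 0]][np.newaxis, ...]  # shape (1, 3, 4)

# 1 where an entry differs from the padding one-hot vector, 0 where it matches.
mismatch = 1.0 - (y_true == mask_value).astype(np.float64)
# A real label differs in two entries (index 0 and its own class), padding in none.
mask = mismatch.sum(axis=2) / 2
print(mask)  # [[1. 1. 0.]]
```

The trick relies on y_true being exactly one-hot. With label smoothing or soft targets the mismatch count is no longer 0 or 2, and the mask would need to be built another way.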

TensorFlow cross-entropy loss for sequences of varying length

3 answers: