Finding a variable-length loss mask in Keras / TensorFlow

Time: 2019-04-15 10:05:22

Tags: tensorflow keras deep-learning unsupervised-learning

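I am trying to build a loss function that masks the output values once an "end of sequence" is encountered.

The input is a tensor of shape [BatchSize, MaxSequenceLength, OutputNodes].

Consider the following example: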


    batch_size = 3
    max_sequence_length = 4
    output_nodes = 3
    predicted = [[[0.1,0.3,0.2],[0.4,0.6,0.8],[0.5,0.2,0.3],[0.0,0.0,0.99]],
                 [[0.1,0.3,0.2],[0.4,0.9,0.8],[0.5,0.2,0.9],[0.4,0.6,0.8]],
                 [[0.1,0.3,0.2],[0.4,0.9,0.8],[0.5,0.2,0.1],[0.4,0.6,0.1]]]

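I reserve the last output node for this purpose: node = 2 represents "end of sequence" (EOS). The nodes are labelled 0, 1 and 2.

Based on the predicted values, I need to return a mask that locates the first occurrence of EOS in each sequence.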

In the example above, the first row gives the following sequence (argmax) => 1,2,0,2

The second row gives the sequence => 1,1,2,2

The third row gives the sequence => 1,1,0,1
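For reference, these per-row sequences can be reproduced with a short NumPy sketch. NumPy is used here only for illustration; the question itself works with Keras tensors, and `node_ids` is a name introduced for this example.

```python
import numpy as np

predicted = np.array([
    [[0.1, 0.3, 0.2], [0.4, 0.6, 0.8], [0.5, 0.2, 0.3], [0.0, 0.0, 0.99]],
    [[0.1, 0.3, 0.2], [0.4, 0.9, 0.8], [0.5, 0.2, 0.9], [0.4, 0.6, 0.8]],
    [[0.1, 0.3, 0.2], [0.4, 0.9, 0.8], [0.5, 0.2, 0.1], [0.4, 0.6, 0.1]],
])

# Index of the largest output node at every time step: shape (batch, time).
node_ids = predicted.argmax(axis=-1)
print(node_ids)
# [[1 2 0 2]
#  [1 1 2 2]
#  [1 1 0 1]]
```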

So my mask should be

    [[1,0,0,0],
     [1,1,0,0],
     [1,1,1,1]]

The mask ensures that the values from the first EOS onwards are ignored when the loss is computed.
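To make the effect of such a mask concrete, here is a minimal NumPy sketch that assumes a per-time-step loss has already been computed. The `per_step_loss` values are made up for illustration, and dividing by the number of unmasked steps is just one possible normalisation, not something the question specifies.

```python
import numpy as np

# Mask from the example: 1 = step counts towards the loss, 0 = step is ignored.
mask = np.array([[1, 0, 0, 0],
                 [1, 1, 0, 0],
                 [1, 1, 1, 1]], dtype=np.float32)

# Hypothetical per-time-step loss values, shape (batch, time).
per_step_loss = np.array([[0.5, 2.0, 3.0, 4.0],
                          [0.5, 1.5, 9.0, 9.0],
                          [1.0, 1.0, 1.0, 1.0]], dtype=np.float32)

# Zero out the ignored steps, then average over the steps that remain.
masked_loss = (per_step_loss * mask).sum() / mask.sum()
print(masked_loss)
```

The large values placed after the EOS positions (9.0, for instance) have no influence on the result, which is the behaviour the mask is meant to provide.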

Below is the code snippet I tried:


    sequence_cluster_asign = keras.backend.argmax(sequence_values,axis=-1)
    loss_mask = []
    for seq in K.tf.unstack(sequence_cluster_asign):
        ##appendEOS- To make sure tf.where is not empty
        seq = tf.concat([seq,endOfSequenceTensor],axis=0)
        endOfSequenceLocation = K.tf.where(K.tf.equal(seq,endOfSequence))[0][0]
        loss_mask.append(tf.sequence_mask(endOfSequenceLocation,max_decoder_seq_length,dtype=tf.float32))
    final_mask = K.stack(loss_mask)

It fails with the error: ValueError: Cannot infer num from shape (?, ?)

1 Answer:

Answer 0: (score: 1)

If you want the mask described in the question, you can use the following approach.

    import tensorflow as tf
    import keras
    from keras import backend as K

    # TensorFlow 1.x style: a placeholder with an unknown batch size.
    sequence_values = K.placeholder(shape=(None, 4, 3))
    sequence_cluster_asign = keras.backend.argmax(sequence_values, axis=-1)

    # keras version
    # 1 where the argmax is not the last node (EOS), 0 where it is.
    result = K.cast(K.less(sequence_cluster_asign, sequence_values.get_shape().as_list()[-1] - 1), dtype='int32')
    # The cumulative product along the time axis stays 1 only until the first 0.
    result = K.cumprod(result, axis=-1)

    # tensorflow version
    # result = tf.cast(tf.less(sequence_cluster_asign, sequence_values.get_shape().as_list()[-1] - 1), dtype=tf.int32)
    # result = tf.cumprod(result, axis=-1)

    predicted = [[[0.1,0.3,0.2],[0.4,0.6,0.8],[0.5,0.2,0.3],[0.0,0.0,0.99]],
                 [[0.1,0.3,0.2],[0.4,0.9,0.8],[0.5,0.2,0.9],[0.4,0.6,0.8]],
                 [[0.1,0.3,0.2],[0.4,0.9,0.8],[0.5,0.2,0.1],[0.4,0.6,0.1]]]

    with tf.Session() as sess:
        print(result.eval(feed_dict={sequence_values: predicted}))

Output:

    [[1 0 0 0]
     [1 1 0 0]
     [1 1 1 1]]
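The reason this produces the requested mask: the comparison marks every step whose argmax is not the EOS node with 1, and a cumulative product along the time axis stays 1 only until the first 0 is reached. A step-by-step NumPy sketch of the same idea follows. NumPy stands in for the Keras backend here, and `eos_id`, `node_ids` and `not_eos` are names introduced for illustration.

```python
import numpy as np

predicted = np.array([
    [[0.1, 0.3, 0.2], [0.4, 0.6, 0.8], [0.5, 0.2, 0.3], [0.0, 0.0, 0.99]],
    [[0.1, 0.3, 0.2], [0.4, 0.9, 0.8], [0.5, 0.2, 0.9], [0.4, 0.6, 0.8]],
    [[0.1, 0.3, 0.2], [0.4, 0.9, 0.8], [0.5, 0.2, 0.1], [0.4, 0.6, 0.1]],
])

# The last output node is treated as EOS (index 2 in this example).
eos_id = predicted.shape[-1] - 1

# Predicted node per time step, shape (batch, time).
node_ids = predicted.argmax(axis=-1)

# 1 where the step is NOT an EOS prediction, 0 where it is.
not_eos = (node_ids < eos_id).astype(np.int32)
print(not_eos)
# [[1 0 1 0]
#  [1 1 0 0]
#  [1 1 1 1]]

# Once a 0 appears, every later product along the time axis is 0 as well.
mask = np.cumprod(not_eos, axis=-1)
print(mask)
# [[1 0 0 0]
#  [1 1 0 0]
#  [1 1 1 1]]
```

Because everything is expressed as element-wise operations over the whole batch, no per-sequence loop or `unstack` is needed, so an unknown batch dimension is not a problem. Note that the EOS step itself is masked out; if it should contribute to the loss, an exclusive cumulative product (the `exclusive=True` argument of `tf.math.cumprod`) is one way to shift the mask by one step.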