Question

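I am trying to compute a mean over the actual length of each sequence (masking out the zero vectors).

My input sequence_outputs has shape (batch_size, max_len, dimensions).

I have a tensor that stores the actual length of each sequence in the batch. I compute it with the function from https://danijar.com/variable-sequence-lengths-in-tensorflow/: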

 def length(sequence):
     used = tf.sign(tf.reduce_max(tf.abs(sequence), reduction_indices=2))
     length = tf.reduce_sum(used, reduction_indices=1)
     length = tf.cast(length, tf.int64)
     return length

Then I do this:

lengths = length(sequence_outputs)
lengths = tf.cast(length, tf.float32) 
lengths = tf.expand_dims(lengths,1)
sentence_outputs = tf.reduce_sum(sentence_outputs,1) / lengths

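The graph compiles, but I get NaN loss values. Also, when debugging with eval(), my lengths come out negative.

This seems like a simple problem, but I have been stuck on it for a while and would appreciate some help!

Thanks!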

Answer 1

I don't see a problem here, although your code is more complicated than it needs to be. The following code

import numpy as np
import tensorflow as tf

# creating data
B = 15
MAX_LEN = 4
data = np.zeros([B, MAX_LEN], dtype=np.float32)

for b in range(B):
    current_len = np.random.randint(2, MAX_LEN)
    current_vector = np.concatenate([np.random.randn(current_len), np.zeros(MAX_LEN - current_len)], axis=-1)
    print("{}\t\t{}".format(current_vector, current_vector.shape))
    data[b, ...] = current_vector

data_op = tf.convert_to_tensor(data)


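def tf_length(x):
    # Count the non-zero entries per row; x must be 2-D: (batch, max_len).
    assert len(x.get_shape().as_list()) == 2
    # count_nonzero returns int64 by default, so request x's dtype for the division.
    length = tf.count_nonzero(x, axis=1, keepdims=True, dtype=x.dtype)
    return length


# keepdims=True gives shape (B, 1), matching the (B, 1) output of tf_length.
x = tf.reduce_sum(data_op, axis=1, keepdims=True) / tf_length(data_op)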

# test gradients
grads = tf.gradients(tf.reduce_mean(x), [data_op])

with tf.Session() as sess:
    print(sess.run(grads))

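runs perfectly fine without producing any NaN. Are you sure this is really the code you are running? If I had to guess, I would bet that you forgot the tf.abs in the sequence-length computation.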
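To illustrate that guess, here is a minimal NumPy sketch (not TensorFlow, and not the asker's actual code) that mirrors the arithmetic of the length function from the question. The `use_abs` flag and the toy `batch` are assumptions added purely for the demonstration. Without the absolute value, a time step whose entries are all negative contributes -1 instead of +1, which is one way a length can come out negative.

```python
import numpy as np


def length(sequence, use_abs=True):
    # sequence: zero-padded array of shape (batch_size, max_len, dimensions).
    # use_abs is a hypothetical switch for this demo; the original always uses abs.
    values = np.abs(sequence) if use_abs else sequence
    # +1 for a used time step, 0 for padding (and -1 if abs is skipped
    # and every entry of the step is negative).
    used = np.sign(values.max(axis=2))
    return used.sum(axis=1).astype(np.int64)


# Toy batch: 2 sequences, max_len 3, 2 dimensions.
batch = np.zeros((2, 3, 2), dtype=np.float32)
batch[0, 0] = [-1.0, -2.0]   # all-negative step
batch[0, 1] = [-0.5, -0.1]   # all-negative step
batch[1, 0] = [1.0, -1.0]    # mixed-sign step

print(length(batch, use_abs=True))   # [2 1]
print(length(batch, use_abs=False))  # [-2 1]
```

Dividing a sum by such a length gives a wrong sign, and a length of exactly zero turns 0 / 0 into NaN.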

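Note: both your length function and the tf_length in this post assume that the values inside a sequence are non-zero! Computing the sequence lengths should be the job of whoever produces the data, and the lengths should then be fed into the computation graph. Anything else is, in my opinion, a hacky solution.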
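The recommendation above can be sketched as follows, again in NumPy rather than TensorFlow so that it stays self-contained. The function name `masked_mean`, the sample arrays, and the guard against zero lengths are assumptions made for illustration; in TensorFlow the mask could be built with tf.sequence_mask instead.

```python
import numpy as np


def masked_mean(sequence_outputs, lengths):
    # sequence_outputs: (batch_size, max_len, dimensions)
    # lengths: (batch_size,) true lengths supplied by the data producer.
    max_len = sequence_outputs.shape[1]
    # mask[b, t] is True for t < lengths[b], i.e. for real time steps.
    mask = np.arange(max_len)[None, :] < lengths[:, None]
    mask = mask.astype(sequence_outputs.dtype)[:, :, None]
    summed = (sequence_outputs * mask).sum(axis=1)
    # Guard (an assumption of this sketch): treat length 0 as 1 to avoid 0 / 0.
    denom = np.maximum(lengths, 1).astype(sequence_outputs.dtype)[:, None]
    return summed / denom


outputs = np.array(
    [[[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],
     [[0.0, 0.0], [2.0, 2.0], [9.0, 9.0]]],  # second row starts with a real zero vector
    dtype=np.float32,
)
lengths = np.array([2, 2])

print(masked_mean(outputs, lengths))
```

Because the lengths are given explicitly, the genuine zero vector at the start of the second sequence still counts as a time step, and the trailing values beyond the stated length are ignored.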

TensorFlow mean with dynamic lengths
