Question

I want to change the shape and content of a tensor in a keras model. The tensor is the output of a layer and has

shape1=(batch_size, max_sentences_in_doc, max_tokens_in_doc, embedding_size)

which I want to convert to

shape2=(batch_size, max_documents_length, embedding_size)

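so that it is suitable as input to the next layer. Here sentences are made of tokens and are zero-padded so that every sentence has length=max_tokens_in_sentence. In detail:

I want to concatenate all sentences of a document, keeping only the non-zero part of each sentence;
then I zero-pad that concatenation to length=max_document_length.

So the transition from shape1 to shape2 is not just a reshape, because it involves mathematical operations.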

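I created a function embedding_to_docs(x) that iterates over a tensor of shape1 to convert it to shape2. I call the function from a Lambda layer in the model. It works on dummy data while debugging, but when I try to call it during model building, an error is raised: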

Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.

def embedding_to_docs(x):
    new_output = []
    for doc in x:
        document = []
        for sentence in doc:
            non_zero_indexes = np.nonzero(sentence[:, 0])
            max_index = max(non_zero_indexes[0])
            if max_index > 0:
                document.extend(sentence[0:max_index])
        if MAX_DOCUMENT_LENGTH-len(document) > 0:
            a = np.zeros((MAX_DOCUMENT_LENGTH-len(document), 1024))
            document.extend(a)
        else:
            document = document[0:MAX_DOCUMENT_LENGTH]
        new_output.append(document)

    return np.asarray(new_output)

...

# in the model:
tensor_of_shape2 = Lambda(embedding_to_docs)(tensor_of_shape1)

How can I fix this?

Answer 1

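You can use tf.py_function, which lets you switch from graph mode (used by Keras) to eager mode (where you can iterate over tensors as your function does).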

def to_docs(x):
  return tf.py_function(embedding_to_docs, [x], tf.float32)

tensor_of_shape2 = Lambda(to_docs)(tensor_of_shape1)
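
One practical detail worth sketching: a tensor returned by tf.py_function carries no static shape, so a following layer may be unable to infer its input shape. The example below declares the shape with set_shape. The constants MAX_DOCUMENT_LENGTH and EMBEDDING_SIZE are assumed toy values, and stand_in_embedding_to_docs is a hypothetical placeholder for the real function, not the asker's implementation.

```python
import tensorflow as tf

MAX_DOCUMENT_LENGTH = 4  # assumed toy value
EMBEDDING_SIZE = 2       # assumed toy value; the question uses 1024


def stand_in_embedding_to_docs(x):
    # Hypothetical placeholder: iterates over the batch (legal only in eager
    # mode), flattens the sentence axis into the token axis, and cuts each
    # document to MAX_DOCUMENT_LENGTH tokens.
    docs = [
        tf.reshape(doc, [-1, EMBEDDING_SIZE])[:MAX_DOCUMENT_LENGTH]
        for doc in x
    ]
    return tf.stack(docs)


def to_docs(x):
    out = tf.py_function(stand_in_embedding_to_docs, [x], tf.float32)
    # py_function outputs have an unknown static shape; declare it explicitly.
    out.set_shape([None, MAX_DOCUMENT_LENGTH, EMBEDDING_SIZE])
    return out


@tf.function  # graph mode, comparable to what happens while a Keras model is built
def run_in_graph(x):
    return to_docs(x)


x = tf.ones([3, 2, 5, EMBEDDING_SIZE])  # (batch, sentences, tokens, embedding)
print(run_in_graph(x).shape)
```

The same to_docs can be passed to the Lambda layer exactly as in the answer's snippet.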

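Note that the code run inside your embedding_to_docs must be written with tensorflow eager operations instead of numpy. This means you need to replace some of the numpy calls with tensorflow ones. You will certainly need to replace the return line with: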

return tf.convert_to_tensor(new_output)

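Using numpy arrays would stop gradient computation, but you are probably not interested in gradients flowing through the input data anyway.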
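
As an illustration of replacing the numpy calls with tensorflow ones, here is a minimal sketch of how embedding_to_docs might look when written with eager TensorFlow operations. It is one possible rewrite, not the asker's original logic: it assumes a token is padding when its whole embedding vector is zero, and MAX_DOCUMENT_LENGTH is set to a toy value. Unlike the slice sentence[0:max_index] in the question, which appears to exclude the token at max_index, this version keeps every non-zero token.

```python
import tensorflow as tf

MAX_DOCUMENT_LENGTH = 4  # assumed toy value


def embedding_to_docs(x):
    """x: eager tensor of shape (batch, sentences, tokens, embedding)."""
    docs = []
    for doc in x:  # iteration works because the tensor is eager here
        # Assumption: a token is padding when its whole vector is zero.
        keep = tf.reduce_any(tf.not_equal(doc, 0.0), axis=-1)  # (sentences, tokens)
        # boolean_mask walks sentences in order, so this concatenates them.
        tokens = tf.boolean_mask(doc, keep)[:MAX_DOCUMENT_LENGTH]
        # Zero-pad the token axis up to MAX_DOCUMENT_LENGTH.
        pad = MAX_DOCUMENT_LENGTH - tokens.shape[0]
        docs.append(tf.pad(tokens, [[0, pad], [0, 0]]))
    return tf.stack(docs)


# One document, two sentences, three tokens each, embedding size 2.
x = tf.constant([[[[1., 1.], [2., 2.], [0., 0.]],
                  [[3., 3.], [0., 0.], [0., 0.]]]])
print(embedding_to_docs(x).numpy())
```

For this toy input the three non-zero tokens are concatenated and followed by one row of zeros. Because only TensorFlow operations are used, the function can be wrapped in tf.py_function as described in this answer.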

Error when trying to change the shape of a tensor in a keras model
