Error when checking target: expected dense to have 3 dimensions, but got array with shape (32, 200)

Time: 2018-12-14 17:21:13

Tags: python tensorflow keras tensorflow-datasets

I am trying to modify the example at https://www.tensorflow.org/tutorials/sequences/text_generation to generate character-based text.

The code in the example uses TensorFlow eager execution (via tensorflow.enable_eager_execution) and works fine, but if I disable eager execution I start getting this error:


Error when checking target: expected dense to have 3 dimensions, but got array with shape (32, 200)

Why does this happen? Shouldn't the code behave exactly the same with and without eager execution enabled?

I tried flattening the output of the LSTM layer, but I get a similar error:


ValueError: Error when checking target: expected dense to have shape (1,) but got array with shape (200,)
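For context, the shapes in the first error can be checked with plain numpy (the vocab size below is illustrative, not taken from the question): with return_sequences=True followed by a Dense layer, the model's output is rank 3, so Keras in graph mode expects the targets to be rank 3 as well, one integer label per timestep with a trailing axis of size 1.

```python
import numpy as np

batch_size, sequence_len, vocab_size = 32, 200, 65  # vocab_size is illustrative

# Shape of the model's softmax output: (batch, timesteps, vocab)
predictions = np.zeros((batch_size, sequence_len, vocab_size))

# Shape the dataset pipeline actually yields for the labels: (batch, timesteps)
targets = np.zeros((batch_size, sequence_len), dtype=np.int64)

# Graph-mode Keras wants targets with the same rank as the output,
# hence the "expected dense to have 3 dimensions" error:
expected = targets[..., np.newaxis]
print(expected.shape)  # (32, 200, 1)
```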

The simplest code that reproduces the problem is the following:

import tensorflow as tf
import numpy as np

# tf.enable_eager_execution()

def get_input():
    path_to_file = tf.keras.utils.get_file(
        'shakespeare.txt',
        'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt'
    )
    with open(path_to_file) as f:
        text = f.read()
    return text


def get_dataset(text_as_indexes, sequence_size, sequences_per_batch):
    def split_input(sequence):
        return sequence[:-1], sequence[1:]

    data_set = tf.data.Dataset.from_tensor_slices(text_as_indexes)
    data_set = data_set.batch(sequence_size + 1, drop_remainder=True)
    data_set = data_set.map(split_input)
    data_set = data_set.shuffle(10000).batch(sequences_per_batch, drop_remainder=True)
    return data_set


if __name__ == '__main__':
    sequences_len = 200
    batch_size = 32
    embeddings_size = 64
    rnn_units = 128

    text = get_input()
    vocab = sorted(set(text))
    vocab_size = len(vocab)

    char2int = {c: i for i, c in enumerate(vocab)}
    int2char = np.array(vocab)
    text_as_int = np.array([char2int[c] for c in text])

    dataset = get_dataset(text_as_int, sequences_len, batch_size)
    steps_per_epoch = len(text_as_int) // sequences_len // batch_size

    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Embedding(
        input_dim=vocab_size,
        output_dim=embeddings_size,
        input_length=sequences_len))

    model.add(tf.keras.layers.LSTM(
        units=rnn_units,
        return_sequences=True))

    model.add(tf.keras.layers.Dense(units=vocab_size, activation='softmax'))

    model.compile(optimizer=tf.train.AdamOptimizer(),
                  loss='sparse_categorical_crossentropy')

    model.summary()
    model.fit(
        x=dataset.repeat(),
        batch_size=batch_size,
        steps_per_epoch=steps_per_epoch)

1 Answer:

Answer 0: (score: 1)

When using sparse_categorical_crossentropy, the labels should have shape (batch_size, sequence_length, 1), not (batch_size, sequence_length). You can fix this by reshaping the labels inside the split_input() function, like this:

def split_input(sequence):
    return sequence[:-1], tf.reshape(sequence[1:], (-1,1))

The code above works under both eager and graph (non-eager) execution.
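The effect of that reshape can be sketched with numpy alone (the 201-element sequence below is a hypothetical stand-in for one (sequence_size + 1)-character batch element, matching sequence_size = 200 from the question):

```python
import numpy as np

# Stand-in for one batched character sequence of length sequence_size + 1
sequence = np.arange(201)

inputs = sequence[:-1]                      # rank 1: shape (200,)
labels = np.reshape(sequence[1:], (-1, 1))  # rank 2: shape (200, 1)

print(inputs.shape)  # (200,)
print(labels.shape)  # (200, 1)
```

After Dataset.batch(batch_size) the labels become (batch_size, 200, 1), which matches the rank-3 output of the Dense layer and satisfies sparse_categorical_crossentropy in graph mode.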