expected activation_1 to have 3 dimensions, but got array with shape (126984, 67)

Asked: 2018-05-26 15:15:58

Tags: python keras

I am writing a model to try to generate realistic text from examples using an LSTM.

Here is the gist of the code:

import io
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation
# ...
path = 'lyrics.txt'
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('corpus length:', len(text))

chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 140

step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1


# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, dropout_W=0.5, return_sequences=True, input_shape=(maxlen, len(chars))))
model.add(LSTM(128, dropout_W=0.5, return_sequences=True))
model.add(LSTM(128, dropout_W=0.5, return_sequences=True))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam')
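As a sanity check, the vectorization scheme above can be exercised on a toy string (a hypothetical stand-in for the lyrics file) to confirm the shapes it produces: `x` is 3D (samples, timesteps, chars), while `y` is 2D (samples, chars), i.e. one target character per sequence.

```python
import numpy as np

# Toy corpus standing in for the lyrics file (assumption for illustration).
text = 'abcabcabcabc'
maxlen = 4
step = 3

chars = sorted(list(set(text)))  # ['a', 'b', 'c']
char_indices = dict((c, i) for i, c in enumerate(chars))

sentences, next_chars = [], []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])

# One-hot encode exactly as in the question's code.
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

print(x.shape)  # (3, 4, 3): 3D input, one row per timestep
print(y.shape)  # (3, 3): 2D target, one label per whole sequence
```

The 2D shape of `y` is what the error message below is complaining about.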

I edited it to try to see the result of stacking multiple LSTMs, and got this error:

Using TensorFlow backend.
corpus length: 381090
total chars: 67
nb sequences: 126984
Vectorization...
Build model...
char_lstm.py:55: UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(128, return_sequences=True, dropout=0.5, input_shape=(140, 67))`
  model.add(LSTM(128, dropout_W=0.5, return_sequences=True, input_shape=(maxlen, len(chars))))
char_lstm.py:56: UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(128, return_sequences=True, dropout=0.5)`
  model.add(LSTM(128, dropout_W=0.5, return_sequences=True))
char_lstm.py:57: UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(128, return_sequences=True, dropout=0.5)`
  model.add(LSTM(128, dropout_W=0.5, return_sequences=True))
Traceback (most recent call last):
  File "char_lstm.py", line 110, in <module>
    callbacks=[print_callback])
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 1002, in fit
    validation_steps=validation_steps)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1630, in fit
    batch_size=batch_size)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1480, in _standardize_user_data
    exception_prefix='target')
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 113, in _standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking target: expected activation_1 to have 3 dimensions, but got array with shape (126984, 67)

I believe the last layer, model.add(Dense(len(chars))), may be the source of the error. I know what the code does, but after many shots in the dark I need to find a proper solution, and more importantly, to understand how that solution relates to the error.

1 Answer:

Answer 0 (score: 1)

You're close. The problem is around Dense(len(chars)): because you also used return_sequences=True in the last LSTM, you are actually returning a 3D tensor of shape (batch_size, maxlen, 128). Now, Dense and softmax can both handle higher-dimensional tensors, operating on the last dimension axis=-1, but that also makes them return sequences. So you have a many-to-many model, while your data is many-to-one. You have two options:

  1. You can drop return_sequences=True from the last LSTM to compress the context, turning the past tokens into a single vector representation of size 128, and make the prediction from that.
  2. If you insist on using the information from all past tokens, then you need to Flatten() the output before passing it to Dense for the prediction.
  3. By the way, you can use Dense(len(chars), activation='softmax') to achieve the same effect in one line.
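The shape mismatch behind the two options can be sketched in plain NumPy, without Keras. Here a matmul on the last axis stands in for Dense (this mirrors how Keras applies Dense to higher-rank tensors, but the `dense` helper below is only an illustrative stand-in, not the real layer):

```python
import numpy as np

batch_size, maxlen, units, n_chars = 8, 140, 128, 67

# Toy stand-in for Dense(n_chars): a matmul applied along the last axis.
W = np.random.randn(units, n_chars)
dense = lambda h: h @ W

# Option 1 (the fix): last LSTM without return_sequences -> 2D context vector.
h_last = np.random.randn(batch_size, units)         # (batch, 128)
print(dense(h_last).shape)                          # (8, 67): matches 2D y

# The buggy setup: last LSTM with return_sequences=True -> 3D tensor.
h_seq = np.random.randn(batch_size, maxlen, units)  # (batch, 140, 128)
print(dense(h_seq).shape)                           # (8, 140, 67): 3D output,
                                                    # but y has shape (126984, 67)
```

With the 3D output, Keras expects a 3D target (one label per timestep), which is exactly the "expected activation_1 to have 3 dimensions" error in the question.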