Question

I am building a relatively small LSTM model in Google Colab.

For reference, I am using TensorFlow 1.13 to build the model, with tensorflow.keras as the Keras API.

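# Assumed aliases: ll is tensorflow.keras.layers; word_index comes from the tokenizer.
from tensorflow.keras import layers as ll
from tensorflow.keras.models import Model

seq_len = 20000
n_classes = 4
inputs = ll.Input(shape=(seq_len,))
x = ll.Embedding(len(word_index), 1000)(inputs)
x = ll.LSTM(units=100, activation='relu', return_sequences=True)(x)
outputs = ll.Dense(units=n_classes, activation='softmax')(x)
model = Model(inputs, outputs)
model.summary()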

I have checked that I have 15 GB of GPU RAM available, and according to my estimates the model with a batch size of 32 should fit in about 3 GB of RAM.
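The question does not show how the 3 GB figure was obtained. One plausible reading, sketched below, is that it counts only the float32 output activations of each layer for one batch. The helper name `activation_bytes` and the 4-bytes-per-value assumption are illustrative and not from the original post; the sketch ignores weights, gradients, and any per-timestep intermediates kept for backpropagation.

```python
def activation_bytes(batch_size, seq_len, features, bytes_per_value=4):
    """Bytes needed for one (batch, time, features) float32 tensor."""
    return batch_size * seq_len * features * bytes_per_value

batch_size, seq_len = 32, 20000

# Output tensors of the three layers in the model above.
embedding_out = activation_bytes(batch_size, seq_len, 1000)  # Embedding, dim 1000
lstm_out = activation_bytes(batch_size, seq_len, 100)        # LSTM, 100 units
softmax_out = activation_bytes(batch_size, seq_len, 4)       # Dense softmax, 4 classes

total_gb = (embedding_out + lstm_out + softmax_out) / 1e9
print(f"embedding: {embedding_out / 1e9:.2f} GB")
print(f"lstm:      {lstm_out / 1e9:.2f} GB")
print(f"softmax:   {softmax_out / 1e9:.2f} GB")
print(f"total:     {total_gb:.2f} GB")
```

Under these assumptions the embedding output dominates, and the total lands close to the 3 GB quoted. Anything the backend stores per time step for the backward pass comes on top of this.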

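However, whenever I start training, the server runs out of memory.

To be fair, I am using very long sequences (20000 is the maximum sequence length), but I would expect the model to be unrolled symbolically in memory and still fit.

Reducing the batch size to 1 does not help either.

What is going on? How can I make this model fit in memory?

Edit: I tried reducing the sequence length to 2, and that does make it fit in memory. But I need the sequence length to stay high. How can I tell TensorFlow not to unroll the network at any point? (I suspect that is what is happening behind the scenes. How can I check whether this is really the case?)

Edit: If I remove the Softmax layer, memory usage drops back to a normal range. I think the Softmax layer is causing TensorFlow to unroll the network. However, wrapping the Softmax in TimeDistributed did not help.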

Answer 1

Changing the LSTM layer to a CuDNNLSTM layer did the trick!

inputs = ll.Input(shape=(seq_len,))
x = ll.Embedding(len(word_index), 1024)(inputs)
# CuDNNLSTM takes no activation argument (it always uses tanh), so activation='relu' is dropped.
x = ll.CuDNNLSTM(units=100, return_sequences=True)(x)
x = ll.Dense(units=n_classes, activation='softmax')(x)
outputs = x
model = Model(inputs, outputs)
