I am trying to train a language model with a 2-layer LSTM in TensorFlow, following the code at https://github.com/tsungruihon/RNN-language-model/blob/master/RNNLM.py. However, the program hits an OOM error about halfway through training (after roughly 45% of the training data). The main parameters are as follows:
batch iterator: yes
learning rate decay: yes
padding batch: yes
num of LSTM layers: 2
num of hidden units: 128
vocab size: 50000
batch size: 64
num of epoch: 1
num of training data: 20,000,000
num of validation data: 1,000,000
loss function: tf.nn.sparse_softmax_cross_entropy_with_logits
Optimizer: AdagradOptimizer
num of GPU: 1
memory of GPU: 11G

I am new to TensorFlow and language models. Right now I think the main problem is the loss function. I hope someone can give me a hint. Thanks a lot!
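For context on why I suspect the loss: the logits tensor passed to tf.nn.sparse_softmax_cross_entropy_with_logits has shape [batch_size, seq_len, vocab_size], and with a 50000-word vocabulary it grows quickly. A rough back-of-the-envelope sketch (seq_len = 100 is only an assumed value for illustration, not from my actual setup):

```python
# Rough memory estimate for the logits tensor fed to
# tf.nn.sparse_softmax_cross_entropy_with_logits.
batch_size = 64
seq_len = 100          # assumption for illustration, not my real padded length
vocab_size = 50000
bytes_per_float32 = 4

logits_bytes = batch_size * seq_len * vocab_size * bytes_per_float32
print(f"logits per batch: {logits_bytes / 1024**3:.2f} GiB")
```

Even before counting gradients and the Adagrad accumulator slots (which roughly multiply this), a single batch of logits already takes over 1 GiB at these sizes, which seems consistent with running out of an 11G GPU.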