I am trying to train a Keras seq2seq encoder-decoder model like this:
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
enc_inputs (InputLayer)         (None, None)         0
__________________________________________________________________________________________________
dec_inputs (InputLayer)         (None, None)         0
__________________________________________________________________________________________________
enc_embedding (Embedding)       (None, None, 256)    1840384     enc_inputs[0][0]
__________________________________________________________________________________________________
dec_embedding (Embedding)       (None, None, 256)    1291008     dec_inputs[0][0]
__________________________________________________________________________________________________
encoder_lstm (LSTM)             [(None, 256), (None, 525312      enc_embedding[0][0]
__________________________________________________________________________________________________
dec_lstm (LSTM)                 [(None, None, 256),  525312      dec_embedding[0][0]
                                                                 encoder_lstm[0][1]
                                                                 encoder_lstm[0][2]
__________________________________________________________________________________________________
time_distributed_1 (TimeDistrib (None, None, 5043)   1296051     dec_lstm[0][0]
==================================================================================================
Total params: 5,478,067
Trainable params: 5,478,067
Non-trainable params: 0
__________________________________________________________________________________________________
None
Fit model
But the process is killed before any training actually starts:
Process finished with exit code 137 (interrupted by signal 9: SIGKILL)
I tried reducing the batch size from 64 to 16, but I get the same error.
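I am running from PyCharm on macOS, using a Python 3.6 environment in Conda. My python3.6 process uses a large amount of virtual memory, apparently reaching about 100 GB before it is killed. Strangely, this used to work with my previous model, which had about 14M total parameters, but even that model no longer trains. I have not knowingly changed anything: same TF version, same Python/Conda environment.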
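For scale, here is a rough sketch that recomputes the parameter counts from the layer shapes in the summary above and estimates how much memory the weights alone need. The encoder vocabulary size is inferred from the enc_embedding parameter count, and float32 weights are an assumption; neither is stated in the post.

```python
# Parameter counts recomputed from the layer shapes in the model summary.
EMBED_DIM = 256                    # embedding width shown in the summary
LSTM_UNITS = 256                   # LSTM output width shown in the summary
DEC_VOCAB = 5043                   # size of the final TimeDistributed output
ENC_VOCAB = 1840384 // EMBED_DIM   # inferred from the enc_embedding param count


def embedding_params(vocab, dim):
    return vocab * dim


def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, output_dim):
    return input_dim * output_dim + output_dim


layer_params = {
    "enc_embedding": embedding_params(ENC_VOCAB, EMBED_DIM),
    "dec_embedding": embedding_params(DEC_VOCAB, EMBED_DIM),
    "encoder_lstm": lstm_params(EMBED_DIM, LSTM_UNITS),
    "dec_lstm": lstm_params(EMBED_DIM, LSTM_UNITS),
    "time_distributed_1": dense_params(LSTM_UNITS, DEC_VOCAB),
}
total_params = sum(layer_params.values())

# Assumption: weights are stored as 4-byte float32 values.
weight_megabytes = total_params * 4 / 1e6

print(total_params)                 # matches "Total params" in the summary
print(round(weight_megabytes, 1))   # weights alone, in MB
```

The weights come to a few tens of megabytes, so the model size by itself does not account for memory use in the range of 100 GB.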
I think the only difference is the switch from PyCharm CE to Professional.
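One way to narrow down where the memory goes is to log the peak memory of the process at a few points (after loading data, after building the model, just before fitting). The sketch below uses only the standard library; the helper names `peak_rss_megabytes` and `report` are hypothetical, and it reports resident memory, which is not the same figure as the virtual memory shown by the system monitor.

```python
import resource
import sys


def peak_rss_megabytes():
    """Peak resident set size of this process so far, in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    bytes_per_unit = 1 if sys.platform == "darwin" else 1024
    return peak * bytes_per_unit / 1e6


def report(stage):
    # `stage` is a free-form label, e.g. "after building model".
    print("{}: peak RSS {:.1f} MB".format(stage, peak_rss_megabytes()))


report("startup")
```

Calling `report(...)` before and after each setup step shows which step causes the jump, which helps separate a data-preparation problem from a problem inside the training call.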
So, the question:
Answer 0 (score: 0)
It looks like a problem with the installed TensorFlow version. With TensorFlow 1.11 (tf.__version__ == 1.11), the model runs fine.
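A minimal sketch for confirming which TensorFlow release the interpreter actually picks up, since PyCharm may point at a different Conda environment than expected. The helper `release_of` is hypothetical, and the "1.11" target is simply the version reported to work above.

```python
def release_of(version):
    """Return the 'major.minor' prefix of a version string such as '1.11.0'."""
    return ".".join(version.split(".")[:2])


try:
    import tensorflow as tf
    installed = tf.__version__
except ImportError:
    installed = None  # TensorFlow is not available in this environment

if installed is None:
    print("TensorFlow is not installed in this environment")
elif release_of(installed) == "1.11":
    print("Running TensorFlow", installed)
else:
    print("Expected TensorFlow 1.11, found", installed)
```

Running this from inside PyCharm and from a terminal with the Conda environment activated shows whether both use the same installation.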