I am writing a Keras model for a natural language inference (entailment) problem over premises and hypotheses.
I run the premise and the hypothesis through a Bi-LSTM with return_state=True, and then encode each of them independently with another Bi-LSTM whose initial state is set from those last-state vectors. This is how the Bi-LSTMs are created:
LastStateLSTM = LSTM(hidden_units, implementation=2, return_sequences=False, name='laststate', return_state=True)
LastStateLSTM = Bidirectional(LastStateLSTM, name='bilstm1')
EncoderLSTM = LSTM(hidden_units, implementation=2, return_sequences=True, name='encoderlstm')
EncoderLSTM = Bidirectional(EncoderLSTM, name='bilstm2')
Here is where I apply these layers:
out1, forward_prem_h, forward_prem_c, backward_prem_h, backward_prem_c = LastStateLSTM(prem)
out2, forward_hyp_h, forward_hyp_c, backward_hyp_h, backward_hyp_c = LastStateLSTM(hyp)
u = EncoderLSTM(prem, initial_state=[forward_hyp_h, forward_hyp_c, backward_hyp_h, backward_hyp_c])
v = EncoderLSTM(hyp, initial_state=[forward_prem_h, forward_prem_c, backward_prem_h, backward_prem_c])
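Conceptually, the state handoff between the two layers works like this. Below is a minimal NumPy sketch of a single unidirectional LSTM (not the Keras implementation; the weights, shapes, and sequence lengths are toy assumptions) that reads the hypothesis and then encodes the premise starting from the hypothesis's final (h, c) state:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates computed from input x and previous state (h, c)."""
    n = h.shape[0]
    z = W @ x + U @ h + b                 # all four gate pre-activations, stacked
    i = 1 / (1 + np.exp(-z[:n]))          # input gate
    f = 1 / (1 + np.exp(-z[n:2*n]))       # forget gate
    g = np.tanh(z[2*n:3*n])               # candidate cell state
    o = 1 / (1 + np.exp(-z[3*n:]))        # output gate
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def run_lstm(seq, h0, c0, W, U, b):
    """Run the LSTM over a sequence, starting from the given initial state."""
    h, c = h0, c0
    for x in seq:
        h, c = lstm_step(x, h, c, W, U, b)
    return h, c

rng = np.random.default_rng(0)
d, n = 4, 3                                   # toy input dim and hidden_units
W = rng.normal(size=(4 * n, d))
U = rng.normal(size=(4 * n, n))
b = np.zeros(4 * n)

prem = [rng.normal(size=d) for _ in range(5)]  # toy premise sequence
hyp = [rng.normal(size=d) for _ in range(5)]   # toy hypothesis sequence

# First pass: read the hypothesis from a zero state (the LastStateLSTM role).
hyp_h, hyp_c = run_lstm(hyp, np.zeros(n), np.zeros(n), W, U, b)

# Second pass: encode the premise, seeded with the hypothesis's final state
# (the EncoderLSTM role with initial_state=[...]).
u_h, u_c = run_lstm(prem, hyp_h, hyp_c, W, U, b)
```

In the actual model the same handoff happens twice per direction (forward and backward), which is why the Bidirectional wrapper returns and expects four state tensors.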
Finally, the compile(), summary() and fit() calls:
final_model.compile(optimizer=Adam(amsgrad=True), loss='categorical_crossentropy', metrics=['accuracy'])
final_model.summary()
final_model.fit([X1_nli, X2_nli, overlapFeatures_nli, refutingFeatures_nli, polarityFeatures_nli, handFeatures_nli, cosFeatures_nli, bleu_nli, rouge_nli, cider_nli_train], \
Y_nli, validation_data=([X1_test_nli, X2_test_nli, overlapFeatures_nli_test, refutingFeatures_nli_test, polarityFeatures_nli_test, handFeatures_nli_test, cosFeatures_nli_test, bleu_nli_test, rouge_nli_test, cider_snli_test], Y_test_nli), \
callbacks=[checkpoint], epochs=1, batch_size=8)
However, I found that when I pass the initial_state argument to EncoderLSTM, training never starts (and no error is raised), even though the model summary prints fine.
I am using Keras 2.2.2 with the Theano backend. Has anyone run into a similar problem?
Thanks.