I am using fit_generator to train a translation model. The generator code is below.
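    # Imports assumed from the names used below; the original post does not show them.
    from math import ceil
    from numpy import array
    from keras.utils import Sequence

    # encode_sequences, encode_output, the two tokenizers, the max sentence
    # lengths and eng_vocab_size are defined elsewhere in my script (not shown).
    class My_Generator(Sequence):
        def __init__(self, X_data, Y_data, batch_size):
            self.X_data, self.Y_data = X_data, Y_data
            #print(X_data)
            self.batch_size = batch_size

        def __len__(self):
            # Number of batches per epoch; the last batch may be smaller.
            print(int(ceil(len(self.X_data) / float(self.batch_size))))
            return int(ceil(len(self.X_data) / float(self.batch_size)))

        def __getitem__(self, idx):
            #print('hey')
            batch_x = self.X_data[idx * self.batch_size:(idx + 1) * self.batch_size]
            batch_y = self.Y_data[idx * self.batch_size:(idx + 1) * self.batch_size]
            # Source (Hindi) and target (English) sentences as padded integer sequences.
            hindi_data = encode_sequences(hin_tokenizer, batch_x, hin_max_len_sent)
            eng_data = encode_sequences(eng_tokenizer, batch_y, eng_max_len_sent)
            # Encode the target sequences over the English vocabulary.
            output = encode_output(eng_data, eng_vocab_size)
            return array(hindi_data), array(output)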
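As a sanity check, the batching arithmetic used in `__len__` and `__getitem__` can be run on its own, without Keras. This is only a sketch: `num_batches`, `batch_slice` and the toy `sentences` list are illustrative names, not part of my script.

```python
from math import ceil


def num_batches(n_samples, batch_size):
    # Same formula as My_Generator.__len__.
    return int(ceil(n_samples / float(batch_size)))


def batch_slice(data, idx, batch_size):
    # Same slicing as My_Generator.__getitem__.
    return data[idx * batch_size:(idx + 1) * batch_size]


# Toy data standing in for the real sentence lists.
sentences = ["s%d" % i for i in range(10)]
batch_size = 4

batches = [batch_slice(sentences, i, batch_size)
           for i in range(num_batches(len(sentences), batch_size))]
print([len(b) for b in batches])  # the final batch is shorter than batch_size
```

Every sample appears exactly once per epoch, so the slicing itself is unlikely to be what breaks the predictions.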
I trained for 30 epochs, which gave val_loss: 0.5665 and val_acc: 0.9268.
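One thing that may be worth checking with these numbers: if the target sequences are zero-padded to eng_max_len_sent and the accuracy is averaged over every time step, then padding positions count as correct predictions and can inflate val_acc. The sketch below illustrates that effect with made-up token ids; `token_accuracy` and the sample sequences are hypothetical and are not taken from my model.

```python
PAD = 0  # assumption: index 0 is the padding token


def token_accuracy(y_true, y_pred, ignore_pad=False):
    """Fraction of positions where the predicted id equals the true id."""
    correct = 0
    total = 0
    for true_seq, pred_seq in zip(y_true, y_pred):
        for t, p in zip(true_seq, pred_seq):
            if ignore_pad and t == PAD:
                continue  # skip padding positions entirely
            total += 1
            correct += int(t == p)
    return correct / total if total else 0.0


# One target sentence of 3 real tokens padded to length 10,
# and a prediction that gets every real token wrong.
y_true = [[5, 9, 3, 0, 0, 0, 0, 0, 0, 0]]
y_pred = [[7, 2, 0, 0, 0, 0, 0, 0, 0, 0]]

print(token_accuracy(y_true, y_pred))                   # counts padding
print(token_accuracy(y_true, y_pred, ignore_pad=True))  # real tokens only
```

Under these assumptions the accuracy looks high when padding is counted, even though no real word is predicted correctly.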
Prediction code:
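    # Import assumed from the name used below; word_for_id is a helper
    # defined elsewhere in my script (not shown).
    from numpy import argmax

    def predict_sequence(model, tokenizer, source):
        # Predict for one source sentence and take the first (only) result.
        prediction = model.predict(source, verbose=0)[0]
        # Most likely word index at each time step.
        integers = [argmax(vector) for vector in prediction]
        target = list()
        for i in integers:
            word = word_for_id(i, tokenizer)
            print('target ', word)
            # Stop at the first index with no word (e.g. padding).
            if word is None:
                break
            target.append(word)
        return ' '.join(target)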
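The decoding loop can be tested without the model. The sketch below is an assumption about how `word_for_id` works (a reverse lookup in the tokenizer's `word_index`); `FakeTokenizer`, `decode` and the probability vectors are stand-ins made up for illustration.

```python
class FakeTokenizer:
    """Stand-in exposing a word_index dict, with indices starting at 1."""

    def __init__(self, words):
        self.word_index = {w: i + 1 for i, w in enumerate(words)}


def word_for_id(integer, tokenizer):
    # Assumed implementation: reverse lookup, None if the id is unknown.
    for word, index in tokenizer.word_index.items():
        if index == integer:
            return word
    return None


def decode(prediction, tokenizer):
    # Pure-Python argmax over each time step's probability vector.
    integers = [max(range(len(v)), key=v.__getitem__) for v in prediction]
    target = []
    for i in integers:
        word = word_for_id(i, tokenizer)
        if word is None:  # index 0 (padding) has no word, so decoding stops
            break
        target.append(word)
    return ' '.join(target)


tok = FakeTokenizer(["hello", "world"])
prediction = [
    [0.1, 0.8, 0.1],    # argmax 1 -> "hello"
    [0.1, 0.2, 0.7],    # argmax 2 -> "world"
    [0.9, 0.05, 0.05],  # argmax 0 -> padding, stop
]
print(decode(prediction, tok))
```

If the first time step already has its argmax at index 0, this loop returns an empty string, which would be consistent with very poor output.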
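The saved model does not produce correct predictions in the target language, and my BLEU score is 0. I also tried predict_generator, but that does not seem to work either. Any help would be appreciated.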
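For context on the BLEU score of 0: standard BLEU combines the modified n-gram precisions for n = 1 to 4 with a geometric mean, so it falls to 0 as soon as one of them is 0, unless smoothing is applied. The sketch below computes those precisions for a made-up sentence pair; it is not a full BLEU implementation (no brevity penalty), and the sentences are not from my data.

```python
from collections import Counter


def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference, candidate, n):
    """Clipped n-gram precision of candidate against a single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


reference = "the cat sat on the mat".split()
candidate = "the cat is on a mat".split()

precisions = [modified_precision(reference, candidate, n) for n in range(1, 5)]
print(precisions)  # some words match, but no 3-gram or 4-gram does
```

So a BLEU of exactly 0 can also occur for output that shares some words with the reference; printing the per-n precisions would show which case applies.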