I have a few problems with the model I built.
1) Sometimes the prediction returns an array of zeros (but this may be normal).
2) If I repeat the prediction several times (in a loop), I always get the same result.
3) Strangely, if I stop the program and run it again, the prediction changes (point 2 still applies).
4) The results are wrong. I feed in the exact same query used during training (where the predicted response was correct), but the response I get is very different.
Now, the strange thing is that during training I can see it improving, and it works well.
Here is the code that shows the correct results (during training):
decoder_prediction = tf.argmax(decoder_logits, 2)  # This line isn't actually here; I put it here to give context
saver.save(sess, MODEL_LOCATION)
fd = next_feed()
predict_ = sess.run(decoder_prediction, fd)  # fd stands for feed_dict; it provides {encoder_inputs: .., encoder_inputs_length: .., decoder_targets: ..}
for i, (inp, pred) in enumerate(zip(fd[encoder_inputs].T, predict_.T)):  # I left this as-is because I don't quite understand it (the zip(fd...) part)
    index = random.randint(0, len(fd[encoder_inputs].T) - 1)  # This is used to pick a random query/prediction
    print('  sample {}'.format(i + 1))
    print('    Query > {}'.format(batch_to_words(fd[encoder_inputs].T[index])))
    print('    Predicted Response > {}'.format(batch_to_words(predict_.T[index])))
    if i >= 2:
        break
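For context on the zip(fd...) part I mentioned above: after the .T transpose, each row is one sample, so zip simply pairs query i with prediction i and enumerate counts the pairs. A minimal sketch with made-up toy arrays (the values are not from my model):

```python
import numpy as np

# Toy stand-ins for fd[encoder_inputs].T and predict_.T: one row per sample.
inputs_t = np.array([[1, 2], [3, 4], [5, 6]])     # three queries
preds_t = np.array([[7, 8], [9, 10], [11, 12]])   # their three predictions

# zip pairs the i-th query row with the i-th prediction row;
# enumerate adds the running index i used for the sample counter.
pairs = []
for i, (inp, pred) in enumerate(zip(inputs_t, preds_t)):
    pairs.append((i, inp.tolist(), pred.tolist()))
```

So the loop variable `inp`/`pred` already holds the i-th aligned pair, even though the original code then ignores them and prints a random index instead.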
Here is the part where I run into the (prediction) problems:
saver.restore(sess, MODEL_LOCATION)
batch__ = []
query = get_formatted_sentence(sentence)
batch__.append(words_to_batch(query))
batch, batch_len = batch_method(batch__)
for x in range(0, 10):  # This is the loop I was talking about
    prediction = sess.run(decoder_prediction, feed_dict={encoder_inputs: batch, encoder_inputs_length: batch_len})  # Notice that here I do not feed decoder_targets (if I understand them correctly, they are the 'wanted' result; in this case I have no 'wanted' result, I just want a prediction)
    print(prediction.T)
    print(batch_to_words(prediction.T[0]))
That's it. I don't know why this happens; any help is appreciated. I'm also including the other methods below for more context.
Here is next_feed():
def next_feed():
    encoder_inputs_, encoder_input_lengths_ = batch_method(batches_query)
    decoder_targets_, _ = batch_method(
        [sequence + [EOS] + [PAD] for sequence in iter(batches_response)]
    )
    #print(encoder_inputs_)
    #print(encoder_input_lengths_)
    return {
        encoder_inputs: encoder_inputs_,
        encoder_inputs_length: encoder_input_lengths_,
        decoder_targets: decoder_targets_,
    }
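To show what the list comprehension inside next_feed produces before padding, here is a quick sketch with toy values (the token ids and responses are made up; in my code EOS and PAD come from the vocabulary):

```python
EOS, PAD = 1, 0  # assumed token ids, for illustration only
batches_response = [[4, 5], [6]]  # toy responses as integer lists

# Each target is the response followed by EOS and one extra PAD slot,
# exactly as the comprehension in next_feed builds them;
# batch_method then pads all of them to the same length.
decoder_targets_raw = [seq + [EOS] + [PAD] for seq in batches_response]
```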
And here is batch_method:
def batch_method(inputs, max_sequence_length=None):
    """
    Args:
        inputs:
            list of sentences (integer lists)
        max_sequence_length:
            integer specifying how large the `max_time` dimension should be.
            If None, the maximum sequence length is used.
    Outputs:
        inputs_time_major:
            input sentences transformed into a time-major matrix
            (shape [max_time, batch_size]) padded with 0s
        sequence_lengths:
            batch-sized list of integers specifying the number of active
            time steps in each input sequence
    """
    sequence_lengths = [len(seq) for seq in inputs]
    batch_size_ = len(inputs)
    if max_sequence_length is None:
        max_sequence_length = max(sequence_lengths)
    inputs_batch_major = np.zeros(shape=[batch_size_, max_sequence_length], dtype=np.int32)  # == PAD
    for index, seq in enumerate(inputs):
        for j, element in enumerate(seq):
            inputs_batch_major[index, j] = element
    # [batch_size, max_time] -> [max_time, batch_size]
    inputs_time_major = inputs_batch_major.swapaxes(0, 1)
    return inputs_time_major, sequence_lengths
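To make the time-major padding concrete, here is a quick check with toy integer sentences (the values are made up). Note that with max_sequence_length=None a multi-sentence batch is padded to the longest sentence in it, while my single-query prediction batch never gets any padding at all:

```python
import numpy as np

def batch_method(inputs, max_sequence_length=None):
    # batch_method exactly as defined above, repeated here to be runnable
    sequence_lengths = [len(seq) for seq in inputs]
    batch_size_ = len(inputs)
    if max_sequence_length is None:
        max_sequence_length = max(sequence_lengths)
    inputs_batch_major = np.zeros(shape=[batch_size_, max_sequence_length], dtype=np.int32)  # == PAD
    for index, seq in enumerate(inputs):
        for j, element in enumerate(seq):
            inputs_batch_major[index, j] = element
    return inputs_batch_major.swapaxes(0, 1), sequence_lengths

# Two sentences: the shorter one is padded with 0s, then transposed
# to time-major, so each column is one sentence.
time_major, lengths = batch_method([[3, 4, 5], [6, 7]])

# One sentence: max_sequence_length equals its own length, so no padding.
single, single_len = batch_method([[3, 4, 5]])
```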