I am trying to train a sequence-to-sequence model using TensorFlow and have been looking at their example code.
I want to be able to access the vector embeddings created by the encoder, since they seem to have some interesting properties.
However, it is really not clear to me how to do this.
In the Vector Representations of Words example they talk a lot about what these embeddings can be used for, but then do not seem to offer a simple way to access them, unless I am mistaken.
Any help on how to access them would be greatly appreciated.
Answer 0 (score: 5)
As with all things TensorFlow, most variables are created dynamically. There are different ways to access these variables (and their values). Here, the variable you are interested in is part of the set of trainable variables, which we can access with the tf.trainable_variables() function:
for var in tf.trainable_variables():
    print(var.name)
For the GRU seq2seq model, this gives us the following list:
embedding_rnn_seq2seq/RNN/EmbeddingWrapper/embedding:0
embedding_rnn_seq2seq/RNN/GRUCell/Gates/Linear/Matrix:0
embedding_rnn_seq2seq/RNN/GRUCell/Gates/Linear/Bias:0
embedding_rnn_seq2seq/RNN/GRUCell/Candidate/Linear/Matrix:0
embedding_rnn_seq2seq/RNN/GRUCell/Candidate/Linear/Bias:0
embedding_rnn_seq2seq/embedding_rnn_decoder/embedding:0
embedding_rnn_seq2seq/embedding_rnn_decoder/rnn_decoder/GRUCell/Gates/Linear/Matrix:0
embedding_rnn_seq2seq/embedding_rnn_decoder/rnn_decoder/GRUCell/Gates/Linear/Bias:0
embedding_rnn_seq2seq/embedding_rnn_decoder/rnn_decoder/GRUCell/Candidate/Linear/Matrix:0
embedding_rnn_seq2seq/embedding_rnn_decoder/rnn_decoder/GRUCell/Candidate/Linear/Bias:0
embedding_rnn_seq2seq/embedding_rnn_decoder/rnn_decoder/OutputProjectionWrapper/Linear/Matrix:0
embedding_rnn_seq2seq/embedding_rnn_decoder/rnn_decoder/OutputProjectionWrapper/Linear/Bias:0
This tells us that the embedding is called embedding_rnn_seq2seq/RNN/EmbeddingWrapper/embedding:0, and we can use that name to grab a pointer to the variable inside the earlier loop:
for var in tf.trainable_variables():
    print(var.name)
    if var.name == 'embedding_rnn_seq2seq/RNN/EmbeddingWrapper/embedding:0':
        embedding_op = var
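Equivalently, assuming the model has already been built in the default graph, the same tensor can be looked up by name directly (a minimal sketch, not part of the original answer):

embedding_op = tf.get_default_graph().get_tensor_by_name(
    'embedding_rnn_seq2seq/RNN/EmbeddingWrapper/embedding:0')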
We can then pass this op, along with the others, to the session run:
_, loss_t, summary, embedding = sess.run([train_op, loss, summary_op, embedding_op], feed_dict)
and now we have the embeddings (a batched list) for ourselves...
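As for the "interesting properties" the question mentions: once embedding is a plain numpy array of shape [vocab_size, embedding_size], a simple check is a cosine-similarity nearest-neighbour lookup (a minimal sketch; nearest_neighbors is a hypothetical helper, not part of the example code):

import numpy as np

def nearest_neighbors(embedding, token_id, k=5):
    # Normalize each row so dot products become cosine similarities.
    normalized = embedding / (np.linalg.norm(embedding, axis=1, keepdims=True) + 1e-8)
    sims = np.dot(normalized, normalized[token_id])
    sims[token_id] = -np.inf  # exclude the token itself
    return np.argsort(-sims)[:k]  # ids of the k most similar tokens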
Answer 1 (score: 0)
There is a related post, but it is based on tensorflow-0.6, which is outdated. So I have updated his answer for tensorflow-0.8, which is also similar to the latest version.
(* marks the modified lines)
losses = []
outputs = []
*states = []
with ops.op_scope(all_inputs, name, "model_with_buckets"):
    for j, bucket in enumerate(buckets):
        with variable_scope.variable_scope(variable_scope.get_variable_scope(),
                                           reuse=True if j > 0 else None):
            *bucket_outputs, _, bucket_states = seq2seq(encoder_inputs[:bucket[0]],
                                                        decoder_inputs[:bucket[1]])
            outputs.append(bucket_outputs)
            *states.append(bucket_states)  # collect the encoder state for this bucket
            if per_example_loss:
                losses.append(sequence_loss_by_example(
                    outputs[-1], targets[:bucket[1]], weights[:bucket[1]],
                    softmax_loss_function=softmax_loss_function))
            else:
                losses.append(sequence_loss(
                    outputs[-1], targets[:bucket[1]], weights[:bucket[1]],
                    softmax_loss_function=softmax_loss_function))
return outputs, losses, *states
In python/ops/seq2seq, modify embedding_attention_seq2seq():
if isinstance(feed_previous, bool):
    *outputs, states = embedding_attention_decoder(
        decoder_inputs, encoder_state, attention_states, cell,
        num_decoder_symbols, embedding_size, num_heads=num_heads,
        output_size=output_size, output_projection=output_projection,
        feed_previous=feed_previous,
        initial_state_attention=initial_state_attention)
    *return outputs, states, encoder_state
# If feed_previous is a Tensor, we construct 2 graphs and use cond.
def decoder(feed_previous_bool):
    reuse = None if feed_previous_bool else True
    with variable_scope.variable_scope(variable_scope.get_variable_scope(),
                                       reuse=reuse):
        outputs, state = embedding_attention_decoder(
            decoder_inputs, encoder_state, attention_states, cell,
            num_decoder_symbols, embedding_size, num_heads=num_heads,
            output_size=output_size, output_projection=output_projection,
            feed_previous=feed_previous_bool,
            update_embedding_for_previous=False,
            initial_state_attention=initial_state_attention)
        return outputs + [state]

outputs_and_state = control_flow_ops.cond(feed_previous,
                                          lambda: decoder(True),
                                          lambda: decoder(False))
*return outputs_and_state[:-1], outputs_and_state[-1], encoder_state
In model/rnn/translate/seq2seq_model.py, modify __init__():
if forward_only:
    *self.outputs, self.losses, self.states = tf.nn.seq2seq.model_with_buckets(
        self.encoder_inputs, self.decoder_inputs, targets,
        self.target_weights, buckets, lambda x, y: seq2seq_f(x, y, True),
        softmax_loss_function=softmax_loss_function)
    # If we use output projection, we need to project outputs for decoding.
    if output_projection is not None:
        for b in xrange(len(buckets)):
            self.outputs[b] = [
                tf.matmul(output, output_projection[0]) + output_projection[1]
                for output in self.outputs[b]
            ]
else:
    *self.outputs, self.losses, _ = tf.nn.seq2seq.model_with_buckets(
        self.encoder_inputs, self.decoder_inputs, targets,
        self.target_weights, buckets,
        lambda x, y: seq2seq_f(x, y, False),
        softmax_loss_function=softmax_loss_function)
In model/rnn/translate/seq2seq_model.py, modify step():
if not forward_only:
    return outputs[1], outputs[2], None  # Gradient norm, loss, no outputs.
else:
    *return None, outputs[0], outputs[1:], outputs[-1]  # No gradient norm; loss, outputs, encoder states.
After all of this, we can get the encoder states in translate.py by calling:

_, _, output_logits, states = model.step(sess, encoder_inputs, decoder_inputs,
                                         target_weights, bucket_id, True)
print(states)
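What exactly states contains depends on the cell and the bucket, but for the GRU encoder used here the final state for a batch should have shape [batch_size, size]. A quick shape check is a useful sanity test (a sketch, assuming states converts to a numpy array):

import numpy as np
print(np.asarray(states).shape)  # expect [..., batch_size, size] for a GRU encoder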