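I am trying to convert TensorFlow's official basic word2vec implementation to use tf.estimator.Estimator.
The problem is that the loss function (sampled_softmax_loss or nce_loss) raises an error when it is used with TensorFlow Estimators. In the original implementation it works fine.
This is TensorFlow's official basic word2vec implementation:
This is the Google Colab notebook in which I implemented that code; it runs correctly.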
https://colab.research.google.com/drive/1nTX77dRBHmXx6PEF5pmYpkIVxj_TqT5I
This is the Google Colab notebook in which I changed the code to use a TensorFlow Estimator; it does not work.
https://colab.research.google.com/drive/1IVDqGwMx6BK5-Bgrw190jqHU6tt3ZR3e
For convenience, here is the code from the notebook above where I define model_fn:
batch_size = 128
embedding_size = 128  # Dimension of the embedding vector.
skip_window = 1  # How many words to consider left and right.
num_skips = 2  # How many times to reuse an input to generate a label.
num_sampled = 64  # Number of negative examples to sample.

def my_model(features, labels, mode, params):
    with tf.name_scope('inputs'):
        train_inputs = features
        train_labels = labels
    with tf.name_scope('embeddings'):
        embeddings = tf.Variable(
            tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
        embed = tf.nn.embedding_lookup(embeddings, train_inputs)
    with tf.name_scope('weights'):
        nce_weights = tf.Variable(
            tf.truncated_normal(
                [vocabulary_size, embedding_size],
                stddev=1.0 / math.sqrt(embedding_size)))
    with tf.name_scope('biases'):
        nce_biases = tf.Variable(tf.zeros([vocabulary_size]))
    with tf.name_scope('loss'):
        loss = tf.reduce_mean(
            tf.nn.nce_loss(
                weights=nce_weights,
                biases=nce_biases,
                labels=train_labels,
                inputs=embed,
                num_sampled=num_sampled,
                num_classes=vocabulary_size))
    tf.summary.scalar('loss', loss)
    if mode == "train":
        with tf.name_scope('optimizer'):
            optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=optimizer)
This is where I create the Estimator and start training:
word2vecEstimator = tf.estimator.Estimator(
    model_fn=my_model,
    params={
        'batch_size': 16,
        'embedding_size': 10,
        'num_inputs': 3,
        'num_sampled': 128
    })
word2vecEstimator.train(
    input_fn=generate_batch,
    steps=10)
This is the error message I get when I call train on the Estimator:
INFO:tensorflow:Calling model_fn.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-22-955f44867ee5> in <module>()
1 word2vecEstimator.train(
2 input_fn=generate_batch,
----> 3 steps=10)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
352
353 saving_listeners = _check_listeners_type(saving_listeners)
--> 354 loss = self._train_model(input_fn, hooks, saving_listeners)
355 logging.info('Loss for final step: %s.', loss)
356 return self
/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py in _train_model(self, input_fn, hooks, saving_listeners)
1205 return self._train_model_distributed(input_fn, hooks, saving_listeners)
1206 else:
-> 1207 return self._train_model_default(input_fn, hooks, saving_listeners)
1208
1209 def _train_model_default(self, input_fn, hooks, saving_listeners):
/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py in _train_model_default(self, input_fn, hooks, saving_listeners)
1235 worker_hooks.extend(input_hooks)
1236 estimator_spec = self._call_model_fn(
-> 1237 features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
1238 global_step_tensor = training_util.get_global_step(g)
1239 return self._train_with_estimator_spec(estimator_spec, worker_hooks,
/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py in _call_model_fn(self, features, labels, mode, config)
1193
1194 logging.info('Calling model_fn.')
-> 1195 model_fn_results = self._model_fn(features=features, **kwargs)
1196 logging.info('Done calling model_fn.')
1197
<ipython-input-20-9d389437162a> in my_model(features, labels, mode, params)
33 inputs=embed,
34 num_sampled=num_sampled,
---> 35 num_classes=vocabulary_size))
36
37 # Add the loss value as a scalar to summary.
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py in nce_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name)
1246 remove_accidental_hits=remove_accidental_hits,
1247 partition_strategy=partition_strategy,
-> 1248 name=name)
1249 sampled_losses = sigmoid_cross_entropy_with_logits(
1250 labels=labels, logits=logits, name="sampled_losses")
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name, seed)
1029 with ops.name_scope(name, "compute_sampled_logits",
1030 weights + [biases, inputs, labels]):
-> 1031 if labels.dtype != dtypes.int64:
1032 labels = math_ops.cast(labels, dtypes.int64)
1033 labels_flat = array_ops.reshape(labels, [-1])
TypeError: data type not understood
Edit: As requested, here is typical output of input_fn:
print(generate_batch(batch_size=8, num_skips=2, skip_window=1))
(array([3081, 3081, 12, 12, 6, 6, 195, 195], dtype=int32), array([[5234],
[ 12],
[ 6],
[3081],
[ 12],
[ 195],
[ 6],
[ 2]], dtype=int32))
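To make the shape of this output easier to read, here is a minimal pure-Python sketch of how skip-gram (input, label) pairs arise with skip_window=1 and num_skips=2. It is not the notebook's generate_batch: the helper name skipgram_pairs is hypothetical, the word-id sequence is read off the printed output above, and context words are emitted in a fixed left-to-right order, so the label order differs from the printed output. Using every neighbor in the window corresponds to num_skips = 2 * skip_window.

```python
def skipgram_pairs(word_ids, skip_window):
    """Pair every center word with each neighbor within skip_window positions."""
    inputs, labels = [], []
    # Only words with a full window on both sides act as centers.
    for center in range(skip_window, len(word_ids) - skip_window):
        for offset in range(-skip_window, skip_window + 1):
            if offset == 0:
                continue  # a word is not its own context
            inputs.append(word_ids[center])
            # One label per row, giving the [batch_size, 1] layout shown above.
            labels.append([word_ids[center + offset]])
    return inputs, labels


# Assumed id sequence, inferred from the printed batch above.
word_ids = [5234, 3081, 12, 6, 195, 2]
inputs, labels = skipgram_pairs(word_ids, skip_window=1)
```

Both results are plain Python lists here, which mirrors the point of the question: the real generate_batch also returns non-Tensor values (NumPy arrays).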
Answer 0 (score: 3)
You are using generate_batch like a variable here:
word2vecEstimator.train(
    input_fn=generate_batch,
    steps=10)
Call the function with generate_batch().
But I think you also have to pass some values to the function.
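On passing values: tf.estimator.Estimator.train expects input_fn to be a callable that it invokes itself, so one common way to supply arguments is to bind them ahead of time, for example with functools.partial. The sketch below uses no TensorFlow; generate_batch and train here are hypothetical stand-ins that only illustrate the calling pattern.

```python
import functools


def generate_batch(batch_size, num_skips, skip_window):
    # Hypothetical stand-in: the real function returns (inputs, labels).
    return {"batch_size": batch_size,
            "num_skips": num_skips,
            "skip_window": skip_window}


def train(input_fn):
    # Hypothetical stand-in for Estimator.train: it calls input_fn with no arguments.
    if not callable(input_fn):
        raise TypeError("input_fn must be callable")
    return input_fn()


# Bind the arguments now; the callable is invoked later by train().
bound_input_fn = functools.partial(
    generate_batch, batch_size=8, num_skips=2, skip_window=1)
result = train(bound_input_fn)
```

A lambda such as `lambda: generate_batch(8, 2, 1)` would serve the same purpose; partial simply keeps the bound arguments inspectable.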
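Answer 1 (score: 0)
Tensors and ops may have to be created inside input_fn rather than in model_fn.
I found that issue #4026 solved my problem... Maybe it is just me being dumb, but it would be great if the documentation mentioned somewhere that all tensors and ops must be inside input_fn.
You have to call read_batch_examples from somewhere inside input_fn, so that the tensors it creates are in the graph that the Estimator creates in fit().
https://github.com/tensorflow/tensorflow/issues/8042
Oh, I feel like an idiot! I had been creating the ops outside of the graph scope. It works now; I can't believe I didn't try that. Thanks a lot! This is a non-issue and has been resolved.
https://github.com/tensorflow/tensorflow/issues/4026
However, this still does not fully explain what causes the problem. It is only a clue.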
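Answer 2 (score: 0)
I found the answer.
The error clearly says that your labels have an invalid type.
You are trying to pass a numpy array instead of a Tensor. Sometimes TensorFlow performs an implicit conversion from ndarray to Tensor under the hood (which is why your code works outside of the Estimator), but in this case it does not.
No, the official implementation feeds data through a placeholder. A placeholder is always a Tensor, so it does not depend on any implicit conversion.
But if you call the loss function directly with a numpy array as input (note: the call happens during the graph construction phase, so the argument's contents are embedded into the graph), it may work (however, I have not checked this).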
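This code:
nce_loss(labels = [1,2,3]) is called only once, during graph construction. The labels are embedded statically in the graph as a constant and can be of any Tensor-compatible type (list, ndarray, etc.).
This code:

```Python
def model(label_input):
    nce_loss(labels = label_input)

estimator(model_fun = model).train()
```

cannot embed the labels variable statically, because its contents are not defined during graph construction. So if you feed anything other than a Tensor, it will throw an error.
From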
So I used labels=tf.dtypes.cast(train_labels, tf.int64), and it worked.
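A minimal sketch of that conversion, assuming TensorFlow and NumPy are installed. The helper name to_int64_labels and the sample values are illustrative only; tf.cast is used here, which accepts array-like input, converts it to a Tensor, and then casts it.

```python
import numpy as np
import tensorflow as tf


def to_int64_labels(labels):
    # tf.cast turns array-like input (here an ndarray) into a Tensor and casts it,
    # so the loss function receives a Tensor with a TensorFlow dtype.
    return tf.cast(labels, tf.int64)


# Labels shaped [batch_size, 1], as in the generate_batch output in the question.
numpy_labels = np.array([[5234], [12], [6], [3081]], dtype=np.int32)
tensor_labels = to_int64_labels(numpy_labels)
```

Inside my_model this would correspond to passing labels=to_int64_labels(train_labels) to tf.nn.nce_loss.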