Question

I am trying to use GPU acceleration in eager mode with TensorFlow 1.7rc1, but I keep getting a NotFoundError from various tf functions. Here is my simple test:

import tensorflow as tf
import tensorflow.contrib.eager as tfe
import math
import numpy as np

tfe.enable_eager_execution()

num_sampled=64
vocabulary_size = 512
embedding_size = 128

train_dataset = tf.constant(np.array([1,3,4,5,4]))
train_labels = tf.constant(tf.transpose(np.array([[1,2,1,2,0]])))

embeddings = tfe.Variable(tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
softmax_weights = tfe.Variable(tf.truncated_normal([vocabulary_size, embedding_size],
                     stddev=1.0 / math.sqrt(embedding_size)))
softmax_biases = tfe.Variable(tf.zeros([vocabulary_size]))

with tf.device('/gpu:0'):
    embed = tf.nn.embedding_lookup(embeddings, train_dataset) #the ID can be a list of words
    loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(weights=softmax_weights, biases=softmax_biases, inputs=embed,
                                   labels=train_labels, num_sampled=num_sampled, num_classes=vocabulary_size))
    print(loss)

I get the following error:

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
<ipython-input-4-f6ef06e0fbcf> in <module>()
     21     embed = tf.nn.embedding_lookup(embeddings, train_dataset) #the ID can be a list of words
     22     loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(weights=softmax_weights, biases=softmax_biases, inputs=embed,
---> 23                                    labels=train_labels, num_sampled=num_sampled, num_classes=vocabulary_size))
     24     print(loss)

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_impl.py in sampled_softmax_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name, seed)
   1340       partition_strategy=partition_strategy,
   1341       name=name,
-> 1342       seed=seed)
   1343   sampled_losses = nn_ops.softmax_cross_entropy_with_logits(
   1344       labels=labels, logits=logits)

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name, seed)
   1039           unique=True,
   1040           range_max=num_classes,
-> 1041           seed=seed)
   1042     # NOTE: pylint cannot tell that 'sampled_values' is a sequence
   1043     # pylint: disable=unpacking-non-sequence

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\candidate_sampling_ops.py in log_uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed, name)
    139   return gen_candidate_sampling_ops.log_uniform_candidate_sampler(
    140       true_classes, num_true, num_sampled, unique, range_max, seed=seed1,
--> 141       seed2=seed2, name=name)
    142 
    143 

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_candidate_sampling_ops.py in log_uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed, seed2, name)
    755       else:
    756         message = e.message
--> 757       _six.raise_from(_core._status_to_exception(e.code, message), None)
    758 
    759 

C:\ProgramData\Anaconda3\lib\site-packages\six.py in raise_from(value, from_value)

NotFoundError: No registered 'LogUniformCandidateSampler' OpKernel for GPU devices compatible with node LogUniformCandidateSampler = LogUniformCandidateSampler[num_sampled=64, num_true=1, range_max=512, seed=0, seed2=0, unique=true](dummy_input)
    .  Registered:  device='CPU'
 [Op:LogUniformCandidateSampler]

Could anybody help? The code seems to run fine when I use tf.device('/cpu:0'). I am using tensorflow 1.7rc1 on Windows 10. Thanks a lot!

Answer 1

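What is happening here is that you explicitly asked for the loss computation to run on the GPU (by calling tf.nn.sampled_softmax_loss inside a with tf.device("/gpu:0") block), but it cannot, because the underlying operation has no GPU implementation. The error message says as much: the only kernel registered for LogUniformCandidateSampler is for the CPU. This error is not specific to eager execution; you would get the same error with graph execution (for example, using with tf.Session() as sess: print(sess.run(loss))).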

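If you structure your code to be less prescriptive about where individual operations should execute, TensorFlow has the flexibility to run each operation where it can. For example, move the loss computation outside the with tf.device block: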

with tf.device('/gpu:0'):
    embed = tf.nn.embedding_lookup(embeddings, train_dataset) #the ID can be a list of words
loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(weights=softmax_weights, biases=softmax_biases, inputs=embed,
                              labels=train_labels, num_sampled=num_sampled, num_classes=vocabulary_size))
print(loss)
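
For the graph-execution case mentioned above, another option is to keep the tf.device block and let the session fall back to the CPU for ops that have no GPU kernel, using the allow_soft_placement field of tf.ConfigProto. The following is only a sketch under some assumptions: it uses the TF 1.x graph API (picked up from tf.compat.v1 where that exists, which is a convenience of this example and not part of the original code), it writes the labels directly as an int64 column instead of transposing, and it has not been checked against 1.7rc1 specifically. The names tf1, graph, init, and loss_value are introduced here for illustration.

```python
import math

import numpy as np
import tensorflow as tf

# Assumption: use the TF 1.x-style graph API. Newer releases expose it under
# tf.compat.v1; older 1.x releases expose it on the top-level module.
tf1 = getattr(tf.compat, "v1", tf)

num_sampled = 64
vocabulary_size = 512
embedding_size = 128

graph = tf1.Graph()
with graph.as_default():
    train_dataset = tf1.constant(np.array([1, 3, 4, 5, 4], dtype=np.int64))
    # Labels as an int64 column of shape [batch_size, 1].
    train_labels = tf1.constant(
        np.array([[1], [2], [1], [2], [0]], dtype=np.int64))

    embeddings = tf1.Variable(
        tf1.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
    softmax_weights = tf1.Variable(
        tf1.truncated_normal([vocabulary_size, embedding_size],
                             stddev=1.0 / math.sqrt(embedding_size)))
    softmax_biases = tf1.Variable(tf1.zeros([vocabulary_size]))

    # The whole computation is still requested on the GPU, as in the question.
    with tf1.device('/gpu:0'):
        embed = tf1.nn.embedding_lookup(embeddings, train_dataset)
        loss = tf1.reduce_mean(tf1.nn.sampled_softmax_loss(
            weights=softmax_weights, biases=softmax_biases, inputs=embed,
            labels=train_labels, num_sampled=num_sampled,
            num_classes=vocabulary_size))
    init = tf1.global_variables_initializer()

# Soft placement lets the runtime put an op on the CPU when the requested
# device has no kernel for it (or the device is not available at all).
config = tf1.ConfigProto(allow_soft_placement=True)
with tf1.Session(graph=graph, config=config) as sess:
    sess.run(init)
    loss_value = sess.run(loss)
    print(loss_value)
```

With soft placement the device request acts as a preference instead of a hard requirement, so the candidate sampler can run on the CPU while the remaining ops stay on the GPU when one is present. Adding log_device_placement=True to the same ConfigProto prints where each op actually ran.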

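Side note: in TensorFlow 1.7, eager execution does not use the GPU without an explicit with tf.device("/gpu:0"). In the next version of TensorFlow, however, the GPU will be used even without an explicit device specification (more details in tensorflow/tensorflow#14133). So, if you can, I suggest using 1.8.0-rc0 instead of 1.7.0-rc1, and then you can drop the with tf.device("/gpu:0") entirely.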

Hope that helps.

NotFoundError on OpKernel when using tf.nn.embedding_lookup in tensorflow eager mode
