tf.nn.sampled_softmax_loss cannot run on TPU

Time: 2019-01-20 20:03:16

Tags: tensorflow

'tf.nn.sampled_softmax_loss' does not work on the TPU. Is there another way to run a W2V (word2vec) model on a TPU?

Thanks

Error log:

InvalidArgumentError (see above for traceback): Compilation failure: Detected unsupported operations when trying to compile graph _functionalize_body_1[] on XLA_TPU_JIT: LogUniformCandidateSampler (No registered 'LogUniformCandidateSampler' OpKernel for XLA_TPU_JIT devices compatible with node node sampled_softmax_loss/LogUniformCandidateSampler (defined at <ipython-input-41-b631edb0b916>:37)  = LogUniformCandidateSampler[num_sampled=8, num_true=1, range_max=4, seed=0, seed2=0, unique=true, _device="/device:TPU_REPLICATED_CORE"](sampled_softmax_loss/Cast)
    .  Registered:  device='CPU'
)node sampled_softmax_loss/LogUniformCandidateSampler (defined at <ipython-input-41-b631edb0b916>:37) 
     [[node LoopCond (defined at /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/tpu/python/tpu/training_loop.py:169)  = While[T=[DT_INT32, DT_FLOAT, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE, DT_RESOURCE], body=_functionalize_body_1[], cond=_functionalize_cond_1[]](Const, Const_1, shared_weights_b_adam_0_arg, shared_weights_b_adam_1_0_arg, shared_weights_final_embeddings_adam_0_arg, shared_weights_final_embeddings_adam_1_0_arg, beta1_power_0_arg, beta2_power_0_arg, shared_weights_w_adam_0_arg, shared_weights_w_adam_1_0_arg, tpu_estimator_iterations_per_loop_0_arg, shared_weights_final_embeddings_0_arg, shared_weights_w_0_arg, shared_weights_b_0_arg)]]
    TPU compilation failed
     [[{{node tpu_compile_succeeded_assert/_9718983567481127480/_15}} = TPUCompileSucceededAssert[_device="/job:tpu_worker/replica:0/task:0/device:CPU:0"](TPUReplicate/_compile/_14175807995213759933/_14)]]
     [[{{node tpu_compile_succeeded_assert/_9718983567481127480/_15_G252}} = _Recv[client_terminated=false, recv_device="/job:tpu_worker/replica:0/task:0/device:TPU:0", send_device="/job:tpu_worker/replica:0/task:0/device:CPU:0", send_device_incarnation=496949023105919083, tensor_name="edge_174_tpu_compile_succeeded_assert/_9718983567481127480/_15", tensor_type=DT_FLOAT, _device="/job:tpu_worker/replica:0/task:0/device:TPU:0"]()]]

Edit 1:

Found the cause.

https://cloud.google.com/tpu/docs/troubleshooting#common-errors

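Unavailable TensorFlow op

Error message

NotFoundError: No registered 'OpName' OpKernel for XLA_TPU_JIT devices compatible with node

Details

The model uses a TensorFlow op that is not currently available on the TPU.

For a list of ops available on the TPU, along with plans for future support and suggestions for workarounds, see the guide to available TensorFlow ops.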
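
In the error log above, the op that fails to compile is LogUniformCandidateSampler, which is registered only for CPU. One possible workaround (an assumption on my part, not something the troubleshooting page prescribes) is to draw the negative candidates on the host, for example in the input pipeline, and pass the sampled ids to the model as ordinary inputs so that the TPU graph never contains the sampler op. The sketch below shows only the host-side sampling in plain Python. The helper names `log_uniform_probability` and `sample_log_uniform` are hypothetical and not part of TensorFlow; the formula is the log-uniform distribution described in the TensorFlow candidate sampler documentation. How the sampled ids are wired into the loss on the TPU is left open here.

```python
import math
import random


def log_uniform_probability(class_id, range_max):
    """P(class_id) under the log-uniform (Zipfian) distribution on [0, range_max)."""
    return (math.log(class_id + 2) - math.log(class_id + 1)) / math.log(range_max + 1)


def sample_log_uniform(num_sampled, range_max, unique=True, rng=None):
    """Draw candidate class ids on the host.

    Hypothetical helper for illustration; it is not a TensorFlow API.
    """
    rng = rng or random.Random()
    if unique and num_sampled > range_max:
        raise ValueError("unique sampling needs num_sampled <= range_max")
    log_range = math.log(range_max + 1)
    sampled, seen = [], set()
    while len(sampled) < num_sampled:
        # Inverse-CDF draw: exp(u * log(range_max + 1)) lies in [1, range_max + 1),
        # so truncating and subtracting 1 gives an id in [0, range_max).
        candidate = int(math.exp(rng.random() * log_range)) - 1
        candidate = min(candidate, range_max - 1)  # guard against float rounding
        if unique:
            if candidate in seen:
                continue  # rejection step: redraw ids that were already picked
            seen.add(candidate)
        sampled.append(candidate)
    return sampled


if __name__ == "__main__":
    # Example: 8 distinct negatives from a vocabulary of 10000 ids
    # (the vocabulary size is an assumed value for illustration).
    negatives = sample_log_uniform(num_sampled=8, range_max=10000, rng=random.Random(0))
    print(negatives)
```

Note that the log in the question requests num_sampled=8 unique candidates from range_max=4, which cannot be satisfied with unique=true; the sketch raises a ValueError in that case. If sampling on the host is not practical, computing a full softmax over the vocabulary avoids candidate sampling entirely, at the cost of more computation for large vocabularies.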

0 answers:

No answers yet