Question

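I'm trying to get TensorFlow's random_poisson function to run on my GPU. This TensorFlow source page has a function, testCPUGPUMatch, that compares the output of random_poisson when run on the CPU and on the GPU, so it seems like it should be possible. However, when I test with this code: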

import tensorflow as tf

with tf.Session() as sess:
    with tf.device("/gpu:0"):
        test = sess.run(tf.random_poisson(1.0, [], dtype=tf.float64))
print(test)

I get the error:

InvalidArgumentError: Cannot assign a device for operation 'random_poisson/RandomPoissonV2': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Registered kernels:
  device='CPU'; R in [DT_INT64]; dtype in [DT_INT64]
  device='CPU'; R in [DT_INT64]; dtype in [DT_INT32]
  device='CPU'; R in [DT_INT64]; dtype in [DT_DOUBLE]
  device='CPU'; R in [DT_INT64]; dtype in [DT_FLOAT]
  device='CPU'; R in [DT_INT64]; dtype in [DT_HALF]
  device='CPU'; R in [DT_INT32]; dtype in [DT_INT64]
  device='CPU'; R in [DT_INT32]; dtype in [DT_INT32]
  device='CPU'; R in [DT_INT32]; dtype in [DT_DOUBLE]
  device='CPU'; R in [DT_INT32]; dtype in [DT_FLOAT]
  device='CPU'; R in [DT_INT32]; dtype in [DT_HALF]
  device='CPU'; R in [DT_DOUBLE]; dtype in [DT_INT64]
  device='CPU'; R in [DT_DOUBLE]; dtype in [DT_INT32]
  device='CPU'; R in [DT_DOUBLE]; dtype in [DT_DOUBLE]
  device='CPU'; R in [DT_DOUBLE]; dtype in [DT_FLOAT]
  device='CPU'; R in [DT_DOUBLE]; dtype in [DT_HALF]
  device='CPU'; R in [DT_FLOAT]; dtype in [DT_INT64]
  device='CPU'; R in [DT_FLOAT]; dtype in [DT_INT32]
  device='CPU'; R in [DT_FLOAT]; dtype in [DT_DOUBLE]
  device='CPU'; R in [DT_FLOAT]; dtype in [DT_FLOAT]
  device='CPU'; R in [DT_FLOAT]; dtype in [DT_HALF]
  device='CPU'; R in [DT_HALF]; dtype in [DT_INT64]
  device='CPU'; R in [DT_HALF]; dtype in [DT_INT32]
  device='CPU'; R in [DT_HALF]; dtype in [DT_DOUBLE]
  device='CPU'; R in [DT_HALF]; dtype in [DT_FLOAT]
  device='CPU'; R in [DT_HALF]; dtype in [DT_HALF]

[[Node: random_poisson/RandomPoissonV2 = RandomPoissonV2[R=DT_DOUBLE, S=DT_INT32, dtype=DT_DOUBLE, seed=0, seed2=0, _device="/device:GPU:0"](random_poisson/shape, random_poisson/RandomPoissonV2/rate)]]

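Notice that no registered GPU kernels are listed. The code behaves as expected when run on my CPU, and similar code using uniform_random works when run on my GPU. Am I somehow missing the GPU kernel for random_poisson? Does one not exist, even though the linked source page implies one does? If one doesn't exist, is there an implementation somewhere that does run on the GPU? This is currently the bottleneck in a fairly complicated model I'm implementing, so it would be nice to fix it. I'm running TensorFlow 1.8.0 (installed from pip) on Python 3.6.4 on Arch Linux, with CUDA 9.0 and cuDNN 7.0 on a GeForce GTX 1050.

Thanks!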

Answer 1

There is no GPU kernel for random Poisson. In random_poisson_op.cc, only CPU kernels are registered.
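Since only CPU kernels are registered, one option is to pin just the sampling op to the CPU and leave everything else unpinned so TensorFlow can place it on the GPU when one is available. The following is a minimal sketch under that assumption, not a fix for the bottleneck itself: the names counts and total are illustrative, reduce_sum merely stands in for the rest of the model, and the try/except import is an assumption so the same 1.x-style calls also run on TensorFlow 2.x through tf.compat.v1.

```python
try:
    import tensorflow.compat.v1 as tf  # TensorFlow 2.x exposes the 1.x API here
except ImportError:
    import tensorflow as tf            # plain TensorFlow 1.x, as in the question

graph = tf.Graph()
with graph.as_default():
    # Pin only the Poisson sampling to the CPU, where its kernels are registered.
    with tf.device("/cpu:0"):
        counts = tf.random_poisson(1.0, [4], dtype=tf.float64)
    # No explicit device here: TensorFlow decides where this op runs.
    total = tf.reduce_sum(counts)

with tf.Session(graph=graph) as sess:
    counts_val, total_val = sess.run([counts, total])

print(counts_val, total_val)
```

The samples still have to be copied from host to device memory, so this avoids the placement error but may not remove the cost of sampling on the CPU.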

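If you look at the linked testCPUGPUMatch code, it calls self.test_session(use_gpu=True, ...), which tries to run on the GPU if possible. Under the hood, test_session uses allow_soft_placement to do this, which falls back to the CPU if the op cannot run on the GPU. So the test is actually running the op on the CPU.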
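To see this fallback for yourself, a session can be configured with both allow_soft_placement and log_device_placement, which are fields of tf.ConfigProto. The sketch below assumes the TensorFlow 1.x API from the question (with a tf.compat.v1 fallback import as an assumption for TensorFlow 2.x); the variable names are illustrative. The placement log written by TensorFlow should show which device the RandomPoissonV2 node was actually assigned to.

```python
try:
    import tensorflow.compat.v1 as tf  # TensorFlow 2.x exposes the 1.x API here
except ImportError:
    import tensorflow as tf            # plain TensorFlow 1.x, as in the question

graph = tf.Graph()
with graph.as_default():
    # Request the GPU explicitly, exactly as in the question.
    with tf.device("/gpu:0"):
        sample_op = tf.random_poisson(1.0, [], dtype=tf.float64)

config = tf.ConfigProto(
    allow_soft_placement=True,   # fall back to another device if no GPU kernel exists
    log_device_placement=True,   # log the device each op ends up on
)

with tf.Session(graph=graph, config=config) as sess:
    sample = sess.run(sample_op)

print(sample)
```

Without allow_soft_placement, the same graph raises the InvalidArgumentError from the question on a machine with a GPU.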

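[Aside: why do we have a test that doesn't actually do anything? It looks like testCPUGPUMatch is a test written for many different random ops (see test_random_ops.py), and the implementer of RandomPoisson probably added it for that reason. It would seem better to name it testCPUGPUMatchIfGPUImplementationExists...]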

Feel free to file a feature request for a GPU kernel via a GitHub issue, or to submit a PR with an implementation.

Answer 2

import tensorflow as tf
config = tf.ConfigProto(allow_soft_placement=True)

with tf.Session(config=config) as sess:
    with tf.device("/gpu:0"):
        test = sess.run(tf.random_poisson(1.0, [], dtype=tf.float64))
print(test)
