Question

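In the code below, it is absolutely essential that I execute the complete function on the GPU without a single jump back to the CPU. This is because I have 4 CPU cores but 1200 CUDA cores. In theory this should be possible, because the TensorFlow feed-forward passes, the if statements, and the variable assignments can all be done on the GPU (I have an NVIDIA GTX 1060).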

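The problem I am facing is that TensorFlow 2.0 assigns ops to the GPU and CPU automatically in the backend and does not say which of its ops are GPU compatible. When I run the following function with the device set to GPU, I get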

parallel_func could not be transformed and will be staged without change.

and it runs sequentially on the GPU.

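My question is: where should I use tf.device? Which part of the code will AutoGraph convert to GPU code, and which will stay on the CPU? How can I move that part to the GPU as well?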

@tf.function
def parallel_func(self):
    for i in tf.range(114):                     #want this parallel on GPU
        for count in range(320):                #want this sequential on GPU

            retrievedValue = self.data[i][count]

            if self.var[i]==1:
                self.value[i] = retrievedValue     # assigns, if else
            elif self.var[i]==-1:                  # some links to class data through
                self.value[i] = -retrievedValue    # self.data, self.a and self.b

            state = tf.reshape(tf.Variable([self.a[i], self.b[i][count]]), [-1,2])

            if self.workerSwitch == False:
                action = tf.math.argmax(self.feed_forward(i, count, state))
            else:
                action = tf.math.argmax(self.worker_feed_forward(i, count, state))

            if (action==1 or action==-1):
                self.actionCount +=1

Answer 1

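Side note: the message parallel_func could not be transformed and will be staged without change is output by AutoGraph. Because the function contains data-dependent control flow, it likely cannot run at all without that transformation. It would be worth filing an issue with steps to reproduce and more detailed log messages.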
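
On the "where to use tf.device" part, here is a minimal sketch rather than a drop-in fix. It assumes TensorFlow 2.1 or later (for tf.config.list_physical_devices), and the names update_values, data_col and DEVICE are made up for illustration; they are not from the question. It shows two things: tf.debugging.set_log_device_placement(True) makes TensorFlow log which device each op runs on, and the per-row if/elif can be written as a single tf.where over all rows, so it no longer needs a Python-level branch for each i. Unlike the question's code, it returns the updated values instead of assigning into self.value.

```python
import tensorflow as tf

# Log the device every op is placed on; call this before any ops run.
tf.debugging.set_log_device_placement(True)

# Fall back to the CPU so the sketch also runs on a machine without a GPU.
DEVICE = "/GPU:0" if tf.config.list_physical_devices("GPU") else "/CPU:0"


@tf.function
def update_values(value, data_col, var):
    """Vectorized form of the if/elif in the question, for one `count` step.

    value:    [n] float tensor, current values (hypothetical stand-in for self.value)
    data_col: [n] float tensor, data[:, count]
    var:      [n] int tensor holding 1, -1, or anything else
    """
    with tf.device(DEVICE):
        sign = tf.cast(var, data_col.dtype)
        # True where var is +1 or -1; those rows take the signed data value.
        is_signed = tf.logical_or(tf.equal(var, 1), tf.equal(var, -1))
        # All other rows keep their previous value, as in the original branch.
        return tf.where(is_signed, sign * data_col, value)


value = tf.constant([9.0, 9.0, 9.0])
data_col = tf.constant([1.0, 3.0, 5.0])
var = tf.constant([1, -1, 0])

result = update_values(value, data_col, var)
print(result.numpy())
```

The placement log should show whether each op landed on the device you asked for. Whether the feed_forward calls can be batched across i in the same way depends on how they are written, which the question does not show.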

How to use AutoGraph with tf.function-wrapped class methods and tf.device?
