Question

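I need to run a loop in parallel on the GPU, where each iteration calls a function that independently computes one row of a matrix. I am currently using map_fn, but as far as I understand, I have to use the while_loop function to get parallel computation under eager execution. Unfortunately, I don't find this function very intuitive to use, so I would appreciate help converting the map_fn in my code to a while_loop. Here is a simplified version of the code: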

*some 1-D float tensors*

def compute_row(ithStep):
    *operations on the 1-D tensors that return a 1-D tensor with fixed length*
    return values

image = tf.map_fn(compute_row, tf.range(0,nRows))

The version I wrote with while_loop, following the example in the documentation and other questions on Stack Overflow, is:

*some 1-D float tensors*

def compute_row(i):
    *operations on the 1-D tensors that return a 1-D tensor with fixed length*
    return values

def condition(i):
    return tf.less(i, nRows)

i = tf.constant(0)

image = tf.while_loop(condition, compute_row, [i])

But in this case I get:

ValueError: The two structures don't have the same nested structure.

First structure: type=list str=[TensorSpec(shape=(), dtype=tf.int32, name=None)]

Second structure: type=list ... *a long list of tensors*

What is wrong? Thanks in advance. I can provide simplified runnable code if needed.

Edit: runnable code added below

import numpy
import tensorflow as tf
from matplotlib import pyplot

#Defining the data which normally are loaded from file:
#1- matrix of x position-time values, with weights, in sparse format
matrix = numpy.random.randint(2, size = 100).astype(float).reshape(10,10)
x = numpy.nonzero(matrix)[0]
times = numpy.nonzero(matrix)[1]
weights = numpy.random.rand(x.size)

#2- array of y positions
nStepsY = 5
y = numpy.arange(1,nStepsY+1)

#3- the size of the final matrix
nRows = nStepsY
nColumns = 80

# Building the TF tensors
x = tf.constant(x, dtype = tf.float32)
times = tf.constant(times, dtype = tf.float32)
weights = tf.constant(weights, dtype = tf.float32)
y = tf.constant(y, dtype = tf.float32)

# the function to iterate
def compute_row(i):
    yTimed = tf.multiply(y[i],times)
    positions = tf.round((x-yTimed)+50)
    positions = tf.cast(positions, dtype=tf.int32)
    values = tf.math.unsorted_segment_sum(weights, positions, nColumns)
    return values

image = tf.map_fn(compute_row, tf.range(0,nRows), dtype=tf.float32)

%matplotlib inline
pyplot.imshow(image, aspect = 10)
pyplot.colorbar(shrink = 0.75,aspect = 10)

The output image is:

Answer 1

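To build a while loop, you need to define two functions:

A condition function: the loop stops when this function returns false.
A loop body function that performs the desired operations. In your case, since you want to build a tensor, you can think of it as an accumulator function: it takes a tensor as an argument and appends a new row at the end.

With that in mind, we can define the two functions:

First, the loop body. We reuse the compute_row function to compute the new row from the value of i, and use tf.concat to append the new row to our accumulator. We add a leading dimension to the new row so that its shape is compatible with the concatenation. We also increment the counter i by 1.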

def loop_body(i, accumulator):
    new_row = compute_row(i)
    accumulator = tf.concat([accumulator, new_row[tf.newaxis,:]],axis=0)
    return i+1, accumulator

Next, the condition: here we only need to check that i is still less than the required number of rows.

def cond(i,accumulator):
    return tf.less(i,nRows)

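Note that loop_body and cond must take the same arguments. (This is why cond needs a second, unused argument.) The body must also return one value per loop variable, in the same structure as loop_vars. The error in the question arises because compute_row returns a row instead of an updated counter.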
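To make the contract above concrete, here is a toy pure-Python model of it. It is only an illustrative sketch, not TensorFlow's implementation: toy_while_loop and the stand-in compute_row are hypothetical names introduced for this example. It shows that cond and the body receive the same loop variables, and that a body returning something other than one value per loop variable is rejected.

```python
def toy_while_loop(cond, body, loop_vars):
    """Toy stand-in for the loop contract (NOT TensorFlow's implementation)."""
    state = tuple(loop_vars)
    while cond(*state):
        new_state = body(*state)
        if not isinstance(new_state, (tuple, list)):
            new_state = (new_state,)
        # The body must hand back one value per loop variable.
        if len(new_state) != len(state):
            raise ValueError(
                f"body returned {len(new_state)} values, expected {len(state)}"
            )
        state = tuple(new_state)
    return state


n_rows = 3


def compute_row(i):
    # Hypothetical stand-in for the real row computation: a fixed-length row.
    return [float(i)] * 4


def cond(i, rows):
    # Same arguments as the body; the accumulator is simply unused here.
    return i < n_rows


def body(i, rows):
    # Return the updated counter AND the updated accumulator.
    return i + 1, rows + [compute_row(i)]


final_i, rows = toy_while_loop(cond, body, [0, []])

# Passing compute_row directly as the body (as in the question) breaks the
# contract: it returns a row of 4 values where 1 loop variable is expected.
try:
    toy_while_loop(lambda i: i < n_rows, compute_row, [0])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

In this model the failing call mirrors the situation in the question: a single loop variable goes in, and a whole row comes out.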

Now we can put it all together in the while_loop call:

i0 = tf.constant(0) # we initialize the counter i at 0
# we initialize the accumulator as an empty tensor with 0 rows and nColumns columns
accumulator = tf.zeros((0, nColumns)) 
final_i, image = tf.while_loop(cond, loop_body, loop_vars=[i0, accumulator])
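One caveat, offered as a hedged sketch rather than part of the original answer: the accumulator grows by one row per iteration. To my understanding this works as written under eager execution, but inside a tf.function (graph mode) a loop variable whose shape changes needs a shape_invariants entry. The example below shows that variant, plus an alternative that writes rows into a tf.TensorArray instead of concatenating. The small fixed input tensors and the names build_image_concat and build_image_tensor_array are assumptions made so the example is self-contained.

```python
import tensorflow as tf

n_rows, n_columns = 5, 80  # same sizes as in the question

# Stand-in data (assumption): small fixed tensors instead of the random ones.
x = tf.constant([0.0, 1.0, 2.0, 3.0], dtype=tf.float32)
times = tf.constant([0.0, 1.0, 2.0, 3.0], dtype=tf.float32)
weights = tf.constant([0.1, 0.2, 0.3, 0.4], dtype=tf.float32)
y = tf.range(1, n_rows + 1, dtype=tf.float32)


def compute_row(i):
    # Same computation as in the question, for row i.
    positions = tf.cast(tf.round((x - y[i] * times) + 50), tf.int32)
    return tf.math.unsorted_segment_sum(weights, positions, n_columns)


@tf.function
def build_image_concat():
    def cond(i, acc):
        return tf.less(i, n_rows)

    def body(i, acc):
        return i + 1, tf.concat([acc, compute_row(i)[tf.newaxis, :]], axis=0)

    i0 = tf.constant(0)
    acc0 = tf.zeros((0, n_columns))
    # The number of rows changes each iteration, so declare it as unknown.
    _, image = tf.while_loop(
        cond, body, loop_vars=[i0, acc0],
        shape_invariants=[i0.get_shape(), tf.TensorShape([None, n_columns])],
    )
    return image


@tf.function
def build_image_tensor_array():
    rows0 = tf.TensorArray(tf.float32, size=n_rows)

    def cond(i, rows):
        return i < n_rows

    def body(i, rows):
        # write() returns the updated TensorArray, which becomes the new loop var.
        return i + 1, rows.write(i, compute_row(i))

    _, rows = tf.while_loop(cond, body, loop_vars=[tf.constant(0), rows0])
    return rows.stack()  # stack the n_rows rows into an (n_rows, n_columns) tensor


image_a = build_image_concat()
image_b = build_image_tensor_array()
same = bool(tf.reduce_all(tf.equal(image_a, image_b)))
```

The TensorArray variant avoids re-copying the accumulator on every iteration, which may matter when the number of rows is large.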

To check that it reproduces the same values as the map_fn version, we can compare the two results:

>>> image_map = tf.map_fn(compute_row, tf.range(0, nRows), dtype=tf.float32)
>>> tf.reduce_all(tf.equal(image, image_map))
<tf.Tensor: shape=(), dtype=bool, numpy=True>

Using tf.while_loop with 1-D tensors as input to produce a 2-D tensor
