Question

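I have a tensor a, and I'd like to loop over its rows and slice each one based on another tensor l; that is, l gives the length of the vector I need from each row.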

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

sess = tf.InteractiveSession()

a = tf.constant(np.random.rand(3,4)) # shape=(3,4)
a.eval()

Out:
array([[0.35879311, 0.35347166, 0.31525201, 0.24089784],
       [0.47296348, 0.96773956, 0.61336239, 0.6093023 ],
       [0.42492552, 0.2556728 , 0.86135674, 0.86679779]])

l = tf.constant(np.array([3,2,4])) # shape=(3,)
l.eval()

Out:
array([3, 2, 4])

Expected output:

[array([0.35879311, 0.35347166, 0.31525201]),
 array([0.47296348, 0.96773956]),
 array([0.42492552, 0.2556728 , 0.86135674, 0.86679779])]

The tricky part is that a can have None as its first dimension, since that is usually the batch size defined through a placeholder.

I can't just use a mask and a condition as below, because I need to compute the variance of each row separately.

condition = tf.sequence_mask(l, tf.reduce_max(l))
a_true = tf.boolean_mask(a, condition)
a_true.eval()

Out:
array([0.35879311, 0.35347166, 0.31525201, 0.47296348, 0.96773956,
       0.42492552, 0.2556728 , 0.86135674, 0.86679779])

I also tried tf.map_fn, but couldn't get it to work.

elems = (a, l)
tf.map_fn(lambda x: x[0][:x[1]], elems)

Any help would be much appreciated!

Answer 1

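A TensorArray object can store tensors of different shapes. However, it still isn't that simple. Take a look at this example, which does what you want using tf.while_loop() together with tf.TensorArray and tf.slice():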

import tensorflow as tf
import numpy as np

batch_data = np.array([[0.35879311, 0.35347166, 0.31525201, 0.24089784],
                       [0.47296348, 0.96773956, 0.61336239, 0.6093023 ],
                       [0.42492552, 0.2556728 , 0.86135674, 0.86679779]])
batch_idx = np.array([3, 2, 4]).reshape(-1, 1)

x = tf.placeholder(tf.float32, shape=(None, 4))
idx = tf.placeholder(tf.int32, shape=(None, 1))

n_items = tf.shape(x)[0]
init_ary = tf.TensorArray(dtype=tf.float32,
                          size=n_items,
                          infer_shape=False)
def _first_n(i, ta):
    ta = ta.write(i, tf.slice(input_=x[i],
                              begin=tf.convert_to_tensor([0], tf.int32),
                              size=idx[i]))
    return i+1, ta

_, first_n = tf.while_loop(lambda i, ta: i < n_items,
                           _first_n,
                           [0, init_ary])
first_n = [first_n.read(i)                      # <-- extracts the tensors
           for i in range(batch_data.shape[0])] #     that you're looking for

with tf.Session() as sess:
    res = sess.run(first_n, feed_dict={x:batch_data, idx:batch_idx})
    print(res)
    # [array([0.3587931 , 0.35347167, 0.315252  ], dtype=float32),
    #  array([0.47296348, 0.9677396 ], dtype=float32),
    #  array([0.4249255 , 0.2556728 , 0.86135674, 0.8667978 ], dtype=float32)]

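Notes

We still have to read the elements out of the first_n TensorArray one at a time with read(), using the known batch size (batch_data.shape[0] above). No other method that returns a single Tensor can be used, because the rows have different sizes (the exception is TensorArray.concat, but it returns all elements concatenated along one dimension).
If the TensorArray has fewer elements than the index you pass to TensorArray.read(index), you will get an InvalidArgumentError.
You can't use tf.map_fn, because all elements of the tensor it returns must have the same shape.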
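
If the flattened values are acceptable as an intermediate result (for example from TensorArray.concat, or from the tf.boolean_mask call in the question), the rows can be recovered afterwards on the NumPy side. The following is only a sketch under that assumption: flat and lengths are illustrative names for arrays already fetched with sess.run, hard-coded here with the values from the question.

```python
import numpy as np

# Assumed to be fetched from the graph: all kept values in one flat vector,
# plus the per-row lengths (values copied from the question's example).
flat = np.array([0.35879311, 0.35347166, 0.31525201,
                 0.47296348, 0.96773956,
                 0.42492552, 0.2556728, 0.86135674, 0.86679779])
lengths = np.array([3, 2, 4])

# Split at the cumulative row boundaries; the last boundary is the end of
# the vector, so it is dropped.
rows = np.split(flat, np.cumsum(lengths)[:-1])

for row in rows:
    print(row)
```

This happens outside the graph, so it does not depend on the batch size being known when the graph is built.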

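If you only need to compute the variance of the first n elements of each row (without actually collecting the differently sized slices), the task is simpler. In that case we can compute the variance of each sliced tensor directly, write it to the TensorArray, and then stack the TensorArray into a single tensor: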

n_items = tf.shape(x)[0]
init_ary = tf.TensorArray(dtype=tf.float32,
                          size=n_items,
                          infer_shape=False)
def _variances(i, ta, begin=tf.convert_to_tensor([0], tf.int32)):
    mean, varian = tf.nn.moments(
        tf.slice(input_=x[i], begin=begin, size=idx[i]),
        axes=[0]) # <-- compute variance
    ta = ta.write(i, varian) # <-- write variance of each row to `TensorArray`
    return i+1, ta


_, variances = tf.while_loop(lambda i, ta: i < n_items,
                             _variances,
                             [ 0, init_ary])
variances = variances.stack() # <-- read from `TensorArray` to `Tensor`
with tf.Session() as sess:
    res = sess.run(variances, feed_dict={x:batch_data, idx:batch_idx})
    print(res) # [0.0003761  0.06120085 0.07217039]
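
As a sanity check on the numbers above, the same per-row population variance can be computed in plain NumPy. This is a reference sketch, not part of the TensorFlow graph: it replaces the while loop with a length mask, and the names mask, counts and means are introduced here for illustration only.

```python
import numpy as np

batch_data = np.array([[0.35879311, 0.35347166, 0.31525201, 0.24089784],
                       [0.47296348, 0.96773956, 0.61336239, 0.6093023],
                       [0.42492552, 0.2556728, 0.86135674, 0.86679779]])
lengths = np.array([3, 2, 4])

# mask[i, j] is True for the first lengths[i] columns of row i.
mask = np.arange(batch_data.shape[1])[None, :] < lengths[:, None]
counts = mask.sum(axis=1)

# Mean and population variance over the unmasked entries of each row.
means = (batch_data * mask).sum(axis=1) / counts
sq_dev = (batch_data - means[:, None]) ** 2 * mask
variances = sq_dev.sum(axis=1) / counts

# Cross-check against slicing each row explicitly.
expected = np.array([np.var(row[:n]) for row, n in zip(batch_data, lengths)])
print(variances)
```

Both variances and expected should agree with the values printed by the session run, up to float32 rounding.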

Loop over tensor dimension 0 (NoneType) using the values of a second tensor
