I want to read time series saved in TFRecord format. Each time series has a different length. What I want to achieve is to split one long tensor into a batch of smaller tensors of a required length. With a NumPy array this is very easy, and it looks like this:
length = 200
for begin in range(tensor_size - length):
    tensor_slice = tf.slice(my_tensor, begin, length)
    my_slices.append(tensor_slice)
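For comparison, the plain-Python version the question alludes to, where len() gives the size directly, might look like the following minimal sketch (the my_list data is a hypothetical stand-in for the long tensor):

```python
length = 3
my_list = [1, 2, 3, 4, 5]  # stands in for the long tensor

# len() plays the role of tensor_size; each slice is one window
my_slices = []
for begin in range(len(my_list) - length + 1):
    my_slices.append(my_list[begin:begin + length])

# my_slices == [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```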
My problem with such a function is: how do I get the size of the tensor so that I can use the loop? Below is the reading and decoding part of the sample code.
file_queue = tf.train.string_input_producer(tf_files, num_epochs=num_epochs)
reader = tf.TFRecordReader()
_, serialized_records = reader.read(file_queue)
feature_map = {
    "speed": tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
    "battery": tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True)
}
features = tf.parse_single_example(serialized_records, feature_map)
speed = tf.cast(features['speed'], tf.float32)
battery = tf.cast(features['battery'], tf.float32)
speeds = []
batteries = []
# SPLIT TENSOR INTO SMALLER TENSORS
features = tf.train.shuffle_batch([speeds, batteries],
                                  batch_size=batch_size,
                                  capacity=5000,
                                  num_threads=4,
                                  min_after_dequeue=1)
return features
Answer 0: (score 1)
You cannot iterate over a tensor the way you would in Python. You could use tf.while_loop, although it is usually avoided unless it is truly the only way to achieve what you need, because it tends to be slow. In your case, you can get the result you want without a loop, for example with tf.gather:
length = 200
features = ...
# Number of elements
n = tf.shape(features)[0]
# Index from zero to number of subtensors
split_idx = tf.range(n - length + 1)
# Index from zero to subtensor length
length_idx = tf.range(length)
# Indices for gather; each row advances one position, like a "rolling window"
gather_idx = split_idx[:, tf.newaxis] + length_idx
# Gather result
features_split = tf.gather(features, gather_idx)
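To see what the broadcasting trick computes, here is a pure-Python sketch of the same index construction (toy features data and list comprehensions are stand-ins for tf.range/tf.gather, not TensorFlow API):

```python
length = 3
features = [10, 20, 30, 40, 50]
n = len(features)

# split_idx[:, tf.newaxis] + length_idx via broadcasting:
# row i of gather_idx is [i, i + 1, ..., i + length - 1]
gather_idx = [[i + j for j in range(length)]
              for i in range(n - length + 1)]

# tf.gather then picks features[idx] for every index in gather_idx,
# producing one "rolling window" per row
features_split = [[features[idx] for idx in row] for row in gather_idx]

# features_split == [[10, 20, 30], [20, 30, 40], [30, 40, 50]]
```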