I want to read time series saved in TFRecord format. Each time series has a different length. What I want to achieve is to split one long tensor into a batch of smaller tensors of a required length. With a NumPy array this is very easy, and it looks like this:
length = 200
for begin in range(tensor_size - length):
    tensor_slice = tf.slice(my_tensor, begin, length)
    my_slices.append(tensor_slice)
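For comparison, the plain-Python version the question alludes to, where len() gives the size directly, might look like the following minimal sketch (the my_list data is a hypothetical stand-in for the long tensor):

```python
length = 3
my_list = [1, 2, 3, 4, 5]  # stands in for the long tensor

# len() plays the role of tensor_size; each slice is one window
my_slices = []
for begin in range(len(my_list) - length + 1):
    my_slices.append(my_list[begin:begin + length])

# my_slices == [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```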
My problem with such a function is: how do I get the size of the tensor so that I can use the loop? Below is the reading and decoding part of the sample code.
file_queue = tf.train.string_input_producer(tf_files, num_epochs=num_epochs)
reader = tf.TFRecordReader()
_, serialized_records = reader.read(file_queue)
feature_map = {
    "speed": tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
    "battery": tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True)
}
features = tf.parse_single_example(serialized_records, feature_map)
speed = tf.cast(features['speed'], tf.float32)
battery = tf.cast(features['battery'], tf.float32)
speeds = []
batteries = []
# SPLIT TENSOR INTO SMALLER TENSORS
features = tf.train.shuffle_batch([speeds, batteries],
                                  batch_size=batch_size,
                                  capacity=5000,
                                  num_threads=4,
                                  min_after_dequeue=1)
return features
Answer 0: (score 1)
You cannot iterate over a tensor the way you would in Python. You could use tf.while_loop, although it is usually avoided unless it is truly the only way to achieve what you need, because it tends to be slow. In your case, you can get the result you want without a loop, for example with tf.gather:
length = 200
features = ...
# Number of elements
n = tf.shape(features)[0]
# Index from zero to number of subtensors
split_idx = tf.range(n - length + 1)
# Index from zero to subtensor length
length_idx = tf.range(length)
# Indices for gather; each row advances one position, like a "rolling window"
gather_idx = split_idx[:, tf.newaxis] + length_idx
# Gather result
features_split = tf.gather(features, gather_idx)
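To see what the broadcasting trick computes, here is a pure-Python sketch of the same index construction (toy features data and list comprehensions are stand-ins for tf.range/tf.gather, not TensorFlow API):

```python
length = 3
features = [10, 20, 30, 40, 50]
n = len(features)

# split_idx[:, tf.newaxis] + length_idx via broadcasting:
# row i of gather_idx is [i, i + 1, ..., i + length - 1]
gather_idx = [[i + j for j in range(length)]
              for i in range(n - length + 1)]

# tf.gather then picks features[idx] for every index in gather_idx,
# producing one "rolling window" per row
features_split = [[features[idx] for idx in row] for row in gather_idx]

# features_split == [[10, 20, 30], [20, 30, 40], [30, 40, 50]]
```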