Question

I have a set of one-dimensional time series that pass through a stack of convolutional layers and end up in the following form:

(batch_size, time_series_length, num_filters)

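I would like to upsample this tensor manually by inserting alternating zeros (much like a transposed convolution), so that the new shape becomes

(batch_size, 2*time_series_length, num_filters)

This would let me include an additional step before the next convolutional layer. In NumPy this is easy, for example with np.insert, but how can I do it with tensors?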
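For reference, the target behaviour can be written in NumPy as a strided assignment into a zero array. This is only a sketch of the operation being asked about; the helper name upsample_zeros_np and its factor and axis parameters are made up for illustration and are not part of the original question.

```python
import numpy as np


def upsample_zeros_np(x, factor=2, axis=1):
    """Insert (factor - 1) zeros after every element along `axis` (hypothetical helper)."""
    out_shape = list(x.shape)
    out_shape[axis] *= factor
    out = np.zeros(out_shape, dtype=x.dtype)
    # Select every `factor`-th position along `axis`, all positions elsewhere.
    index = [slice(None)] * x.ndim
    index[axis] = slice(None, None, factor)
    out[tuple(index)] = x
    return out


# (batch_size, time_series_length, num_filters) = (1, 6, 2)
x = np.arange(1, 13, dtype=np.float32).reshape(1, 6, 2)
y = upsample_zeros_np(x)
print(y.shape)
print(y[0])
```

The even time steps of y hold the original samples and the odd time steps are zero, which is the layout the question describes.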

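I have looked at some similar posts, such as this one, but I cannot see how to do this for multiple dimensions while preserving the other dimensions. Any ideas?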

Answer 1

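The short answer: use tf.scatter_nd.

The tricky part is building the indices for this operation. The following code example shows how to do it for a tensor of arbitrary rank.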

import itertools
import numpy as np
import tensorflow as tf


def pad_strided(x, strides, name=None):
    # Preparatory steps and sanity checks.
    input_shape = x.shape.as_list()
    # Because life gets easier, we let the consumer specify a striding value for EACH dimension
    assert len(strides) == len(input_shape), "Rank of strides and x.shape must be the same"
    output_shape = [s_in * s for s_in, s in zip(input_shape, strides)]

    # Calculate the striding indices for EACH dimension.
    index_ranges = [list(range(0, s_out, s)) for s_out, s in zip(output_shape, strides)]

    # Expand the indices per dimension. The resulting array has shape [n_elements, n_dims].
    # n_elements is the number of values in the input tensor x, i.e. the product of the
    # input shape. n_dims is the number of input (and output) dimensions.
    indices_flat = np.array(list(itertools.product(*index_ranges)))

    # Reshape the flat index array to have the same dimensions as the input plus an
    # additional dimension. If the input had shape [s0, s1, ..., sn], then indices will have
    # shape [s0, s1, ..., sn, n_dims], i.e. its rank is 1 higher than that of the input tensor.
    indices = np.reshape(indices_flat, input_shape + [-1])

    # Now we simply call the TensorFlow operator.
    with tf.variable_scope(name, default_name="pad_strided"):
        t_indices = tf.constant(indices, dtype=tf.int32, name="indices")
        t_output_shape = tf.constant(output_shape, name="output_shape")
        return tf.scatter_nd(t_indices, x, t_output_shape)


session = tf.Session()
batch_size = 1
time_series_length = 6
num_filters = 3
t_in = tf.random.uniform((batch_size, time_series_length, num_filters))
# Specify a stride 2 for the time_series dimension
t_out = pad_strided(t_in, strides=[1, 2, 1])
original, strided = session.run([t_in, t_out])
print(f"Input Tensor:\n{original[:,:,:]}")
print(f"Output Tensor:\n{strided[:,:,:]}")

The output will look something like this:

Input Tensor:
[[[0.0678339  0.07883668 0.49193358]
  [0.5029118  0.8639555  0.74302936]
  [0.995087   0.6315181  0.11990702]
  [0.95606446 0.29059124 0.12656784]
  [0.8278991  0.8518325  0.4033165 ]
  [0.78434443 0.7894305  0.6251142 ]]]
Output Tensor:
[[[0.0678339  0.07883668 0.49193358]
  [0.         0.         0.        ]
  [0.5029118  0.8639555  0.74302936]
  [0.         0.         0.        ]
  [0.995087   0.6315181  0.11990702]
  [0.         0.         0.        ]
  [0.95606446 0.29059124 0.12656784]
  [0.         0.         0.        ]
  [0.8278991  0.8518325  0.4033165 ]
  [0.         0.         0.        ]
  [0.78434443 0.7894305  0.6251142 ]
  [0.         0.         0.        ]]]
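The index construction in pad_strided can be traced by hand on a tiny shape. The sketch below uses only the standard library and plain lists as a stand-in for tf.scatter_nd; the shape, strides, and values are chosen purely for illustration.

```python
import itertools

input_shape = [1, 3, 2]   # (batch, time, channels), illustrative only
strides = [1, 2, 1]       # stride 2 along the time dimension
output_shape = [n * s for n, s in zip(input_shape, strides)]

# One list of target positions per dimension.
index_ranges = [list(range(0, n_out, s)) for n_out, s in zip(output_shape, strides)]

# The Cartesian product enumerates target coordinates in row-major order,
# which is the same order in which the input elements are stored.
targets = list(itertools.product(*index_ranges))

# Emulate the scatter with nested lists: write each input value at its target.
values = [10, 11, 20, 21, 30, 31]  # the input tensor, flattened in row-major order
output = [[[0] * output_shape[2] for _ in range(output_shape[1])]
          for _ in range(output_shape[0])]
for (b, t, c), v in zip(targets, values):
    output[b][t][c] = v

print(index_ranges)
print(targets)
print(output[0])
```

Because the product runs in row-major order, reshaping the coordinate list to the input shape plus one trailing dimension pairs each input element with its own target coordinate, which is the layout tf.scatter_nd expects for its indices argument.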

Answer 2

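I was working on a similar problem for images. I wanted to go from (batch, height, width, in_channels) to (batch, 2*height, 2*width, in_channels). As you said, this is very much like a transposed convolution, so I ended up using tf.nn.conv2d_transpose with strides=2 and filters=tf.ones([1, 1, 1, 1]):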

upsampled_images = tf.nn.conv2d_transpose(images, tf.ones([1, 1, 1, 1]), output_shape, strides=2, padding='VALID')

This works well, so I expect the one-dimensional case to work too, using tf.nn.conv1d_transpose with filters=tf.ones([1, 1, 1]).
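To see why a stride-2 transposed convolution with a width-1 filter amounts to zero insertion, here is a small NumPy model of the operation. It is a sketch of the arithmetic, not TensorFlow's implementation, and the function name conv1d_transpose_np is hypothetical. The filter layout (width, out_channels, in_channels) follows the documented layout of tf.nn.conv1d_transpose. A [1, 1, 1] filter of ones appears to correspond to the single-channel case; using an identity matrix for several channels is an assumption made here to keep the channels separate.

```python
import numpy as np


def conv1d_transpose_np(x, filters, out_len, stride):
    """NumPy model of a 1-D transposed convolution (hypothetical helper).

    x:       (batch, time, in_channels)
    filters: (width, out_channels, in_channels)
    """
    batch, time, in_ch = x.shape
    width, out_ch, f_in = filters.shape
    assert f_in == in_ch, "filter in_channels must match the input"
    # Without padding the result has (time - 1) * stride + width samples;
    # the buffer is extended with zeros if a longer out_len is requested.
    full_len = max(out_len, (time - 1) * stride + width)
    out = np.zeros((batch, full_len, out_ch), dtype=x.dtype)
    for t in range(time):
        for k in range(width):
            # Each input sample is scattered to output position t * stride + k.
            out[:, t * stride + k, :] += x[:, t, :] @ filters[k].T
    return out[:, :out_len, :]


num_filters = 3
x = np.arange(1, 19, dtype=np.float32).reshape(1, 6, num_filters)
# Width-1 identity filter: every channel is copied through unchanged.
identity = np.eye(num_filters, dtype=x.dtype)[np.newaxis]
y = conv1d_transpose_np(x, identity, out_len=2 * x.shape[1], stride=2)
print(y[0])
```

With a width-1 filter, input sample t lands only at output position 2*t, so every odd position stays zero and the result matches the zero-inserted tensor from the question.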

I know this question is old and you have probably found a solution already, but I was looking for an answer myself, so this may help someone else.
