TensorFlow upsampling over multiple dimensions by zero insertion

Time: 2018-07-16 23:52:03

Tags: python tensorflow

I have a set of 1-D time series that pass through a stack of convolutional layers and end up with the following shape:

(batch_size, time_series_length, num_filters) 

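I would like to upsample this tensor manually by inserting alternating zeros (much like a transposed convolution), so that the new shape becomes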

(batch_size, 2*time_series_length, num_filters)

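so that I can include an additional step before the convolutional layer. Doing this in NumPy is simple, for example with np.insert, but how can I do it with tensors?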
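For reference, the NumPy version mentioned here might look like the minimal sketch below. The helper name `upsample_zeros_np` and its `factor` parameter are made up for illustration; the sketch assumes the time axis is axis 1 and that the zeros go after each time step.

```python
import numpy as np


def upsample_zeros_np(x, factor=2):
    """Insert (factor - 1) zeros after every step along axis 1 of x."""
    length = x.shape[1]
    # An index i means "insert before original position i" (i == length appends).
    # Repeating each index (factor - 1) times inserts that many zeros there.
    positions = np.repeat(np.arange(1, length + 1), factor - 1)
    return np.insert(x, positions, 0, axis=1)


# (batch_size, time_series_length, num_filters) = (1, 4, 3)
x = np.arange(1, 13, dtype=np.float32).reshape(1, 4, 3)
y = upsample_zeros_np(x)
print(y.shape)  # (1, 8, 3)
```

The batch and filter dimensions are left untouched because `np.insert` only operates along the given axis.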

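I have looked at some similar posts, such as this one, but I can't work out how to do this across multiple dimensions while preserving the other dimensions. Any ideas?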

2 answers:

Answer 0: (score: 0)

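The short answer is: use tf.scatter_nd.

The tricky part is building the indices for this operation. The following code example shows how to do this for a tensor with an arbitrary number of dimensions.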

import itertools
import numpy as np
import tensorflow as tf


def pad_strided(x, strides, name=None):
    # Preparatory steps and sanity checks.
    input_shape = x.shape.as_list()
    # Because life gets easier, we let the consumer specify a striding value for EACH dimension
    assert len(strides) == len(input_shape), "Rank of strides and x.shape must be the same"
    output_shape = [s_in * s for s_in, s in zip(input_shape, strides)]

    # Calculate the strided indices for EACH dimension.
    index_ranges = [list(range(0, s_out, s)) for s_out, s in zip(output_shape, strides)]

    # Expand the per-dimension indices into their Cartesian product. The resulting
    # array has shape [n_elements, n_dims], where n_elements is the number of values
    # in the input tensor x (the product of the input shape) and n_dims is the
    # number of input (and output) dimensions.
    indices_flat = np.array(list(itertools.product(*index_ranges)))

    # Reshape the flat index array to have the same dimensions as the input plus one
    # additional dimension. If the input has shape [s0, s1, ..., sn], then indices has
    # shape [s0, s1, ..., sn, n_dims], i.e. its rank is one higher than that of x.
    indices = np.reshape(indices_flat, input_shape + [-1])

    # Now we simply call the TensorFlow operator.
    with tf.variable_scope(name, default_name="pad_strided"):
        t_indices = tf.constant(indices, dtype=tf.int32, name="indices")
        t_output_shape = tf.constant(output_shape, name="output_shape")
        return tf.scatter_nd(t_indices, x, t_output_shape)


session = tf.Session()
batch_size = 1
time_series_length = 6
num_filters = 3
t_in = tf.random.uniform((batch_size, time_series_length, num_filters))
# Specify a stride 2 for the time_series dimension
t_out = pad_strided(t_in, strides=[1, 2, 1])
original, strided = session.run([t_in, t_out])
print(f"Input Tensor:\n{original[:,:,:]}")
print(f"Output Tensor:\n{strided[:,:,:]}")

The output will look like this, for example:

Input Tensor:
[[[0.0678339  0.07883668 0.49193358]
  [0.5029118  0.8639555  0.74302936]
  [0.995087   0.6315181  0.11990702]
  [0.95606446 0.29059124 0.12656784]
  [0.8278991  0.8518325  0.4033165 ]
  [0.78434443 0.7894305  0.6251142 ]]]
Output Tensor:
[[[0.0678339  0.07883668 0.49193358]
  [0.         0.         0.        ]
  [0.5029118  0.8639555  0.74302936]
  [0.         0.         0.        ]
  [0.995087   0.6315181  0.11990702]
  [0.         0.         0.        ]
  [0.95606446 0.29059124 0.12656784]
  [0.         0.         0.        ]
  [0.8278991  0.8518325  0.4033165 ]
  [0.         0.         0.        ]
  [0.78434443 0.7894305  0.6251142 ]
  [0.         0.         0.        ]]]
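To see what the index array in pad_strided contains, the same construction can be checked with NumPy alone. The sketch below is only an illustration under assumptions: the small shapes are arbitrary, and the fancy-index assignment is a NumPy stand-in for tf.scatter_nd, not TensorFlow itself.

```python
import itertools

import numpy as np

input_shape = [1, 3, 2]   # (batch_size, time_series_length, num_filters)
strides = [1, 2, 1]       # stride 2 on the time axis only
output_shape = [n * s for n, s in zip(input_shape, strides)]  # [1, 6, 2]

# Same index construction as in pad_strided.
index_ranges = [range(0, n_out, s) for n_out, s in zip(output_shape, strides)]
indices = np.array(list(itertools.product(*index_ranges)))
indices = indices.reshape(input_shape + [-1])  # shape (1, 3, 2, 3)

x = np.arange(1, 7, dtype=np.float32).reshape(input_shape)
out = np.zeros(output_shape, dtype=x.dtype)
# NumPy stand-in for tf.scatter_nd: write x[i, j, k] to out[indices[i, j, k]].
out[tuple(np.moveaxis(indices, -1, 0))] = x

print(indices[0, 1, 0])  # [0 2 0]: input step 1 lands on output step 2
print(out[0])
```

Each entry of `indices` is the output coordinate of the matching input element, so every position that no index points to keeps its initial zero.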

Answer 1: (score: 0)

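I was working on a similar problem for images. I wanted to go from batch, height, width, in_channels to batch, 2*height, 2*width, in_channels. As you said, this is very much like a transposed convolution, so I ended up using tf.nn.conv2d_transpose with strides=2 and filters=tf.ones([1, 1, 1, 1]):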

upsampled_images = tf.nn.conv2d_transpose(images, tf.ones([1, 1, 1, 1]), output_shape, strides=2, padding='VALID')

This works well, so I think the 1-D case should work too, using tf.nn.conv1d_transpose with filters=tf.ones([1, 1, 1]).
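One caveat, as far as I understand the filter layout ([filter_width, output_channels, in_channels]): a filter of ones with shape [1, 1, 1] only fits a single channel. With num_filters channels, an identity matrix reshaped to [1, num_filters, num_filters] would presumably be needed so that each channel is copied to itself. The sketch below is a NumPy emulation of a width-1 transposed convolution, not TensorFlow's implementation; the function name `conv1d_transpose_width1` is hypothetical.

```python
import numpy as np


def conv1d_transpose_width1(x, filters, stride=2):
    """NumPy emulation of a transposed 1-D convolution with filter width 1.

    x:       [batch, length, in_channels]
    filters: [1, out_channels, in_channels] (layout assumed to match TensorFlow's)
    """
    batch, length, _ = x.shape
    out_channels = filters.shape[1]
    out = np.zeros((batch, length * stride, out_channels), dtype=x.dtype)
    # Each input step lands on every stride-th output step, mixed by the filter.
    out[:, ::stride] = np.einsum("blc,oc->blo", x, filters[0])
    return out


x = np.arange(1, 13, dtype=np.float32).reshape(1, 4, 3)
identity = np.eye(3, dtype=np.float32)[np.newaxis]  # shape [1, 3, 3]
up = conv1d_transpose_width1(x, identity)
print(up.shape)  # (1, 8, 3)
```

With the identity filter the channels stay separate, which gives the plain zero-insertion the question asks for. A filter of all ones with more than one channel would instead sum the channels together.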

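I know this question is old and you have probably figured something out already, but I was looking for an answer myself, so this might help someone else.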