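TensorFlow: set a random ROI in the input to zero (like dropout)

Time: 2018-03-07 11:08:22

Tags: python tensorflow

I want to sparsify my input data by setting squares of random size and position to zero. My network should learn to predict per-pixel output (e.g. depth or semantic segmentation) even when the input data is sparse. In NumPy I would do the following: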

import numpy as np

# create random input data
data = np.random.uniform(0, 1, [4, 4])

# initialize mask
mask = np.ones_like(data, dtype=bool)  # np.bool was removed in recent NumPy releases

# get random position and size of dropout square
np.random.seed(42)  # just for this example
rand_pos_x = int(np.random.uniform(0, data.shape[0]))
rand_pos_y = int(np.random.uniform(0, data.shape[1]))
rand_size = int(np.random.uniform(0.2 * data.shape[0], 0.8 * data.shape[0]))

# compute upper left and lower right corners in image coordinates
x1 = int(max(rand_pos_x - np.floor(rand_size / 2), 0))
y1 = int(max(rand_pos_y - np.floor(rand_size / 2), 0))
x2 = int(min(rand_pos_x + np.ceil(rand_size / 2), data.shape[0]))
y2 = int(min(rand_pos_y + np.ceil(rand_size / 2), data.shape[1]))

# set values in input data in random square to zero
mask[x1:x2, y1:y2] = False  # <-- how to do this in tf?
network_input = np.where(mask, data, 0)

print(rand_pos_x, rand_pos_y, rand_size)
print(data)
print(mask)
print(network_input)

Output:

1 3 2
[[0.48526512 0.69295915 0.0659424  0.96775734]
 [0.29714754 0.82867678 0.24399012 0.40785638]
 [0.44178606 0.71495478 0.55438262 0.64918671]
 [0.72574993 0.44672654 0.06619564 0.43418488]]
[[ True  True False False]
 [ True  True False False]
 [ True  True  True  True]
 [ True  True  True  True]]
[[0.48526512 0.69295915 0.         0.        ]
 [0.29714754 0.82867678 0.         0.        ]
 [0.44178606 0.71495478 0.55438262 0.64918671]
 [0.72574993 0.44672654 0.06619564 0.43418488]]

The line I don't know how to translate into proper TensorFlow code is

mask[x1:x2, y1:y2] = False

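A function that could be used for this might be tf.assign. A "dirty" variant would be to stitch the False rectangle together with the True rectangles using tf.stack.

Thanks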
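
For reference, the "stitching" idea mentioned in the question can be sketched in NumPy by padding a block of False values with True on every side. The helper name mask_by_padding is hypothetical, and the sketch assumes the corners have already been clamped to the image. The same idea should carry over to tf.pad, which takes a comparable list of per-axis paddings, but that is untested here.

```python
import numpy as np

def mask_by_padding(shape, x1, x2, y1, y2):
    """Return a boolean mask of `shape` that is False inside [x1:x2, y1:y2].

    Hypothetical helper; assumes 0 <= x1 <= x2 <= shape[0] and
    0 <= y1 <= y2 <= shape[1], i.e. corners already clamped.
    """
    # The "hole" is the rectangle that will be zeroed out
    hole = np.zeros((x2 - x1, y2 - y1), dtype=bool)
    # Per-axis padding: (rows before, rows after), (cols before, cols after)
    paddings = ((x1, shape[0] - x2), (y1, shape[1] - y2))
    # Surround the hole with True so the result has the full image shape
    return np.pad(hole, paddings, mode="constant", constant_values=True)

# Same window as in the question's example: rows 0..1, columns 2..3
mask = mask_by_padding((4, 4), 0, 2, 2, 4)
print(mask)
```

This avoids in-place assignment entirely, which is why it maps onto a graph of immutable tensors more easily than mask[x1:x2, y1:y2] = False does.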

1 answer:

Answer 0 (score: 0)

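I wouldn't use a variable, since variables are usually reserved for storing data across different run calls. The usual approach is to create a new tensor with the modified data. Here is one way to do what you want: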

import tensorflow as tf  # TensorFlow 1.x API (tf.random_uniform, tf.Session)

# Generate data
data = tf.random_uniform((4, 6), 0, 1)
shape = tf.shape(data)
shape_x, shape_y = shape[0], shape[1]
# Pick random window
rand_pos_x = tf.random_uniform((), maxval=shape_x, dtype=tf.int32)
rand_pos_y = tf.random_uniform((), maxval=shape_y, dtype=tf.int32)
min_size = tf.cast(tf.ceil(0.2 * tf.cast(shape_x, dtype=tf.float32)), dtype=tf.int32)
max_size = tf.cast(0.8 * tf.cast(shape_x, dtype=tf.float32), dtype=tf.int32)
rand_size = tf.random_uniform((), minval=min_size, maxval=max_size, dtype=tf.int32)
# Note: writing 0.2 * shape_x directly does not work because shape_x is an
# int32 tensor, hence the explicit casts to float32 above
rand_size_h1 = rand_size // 2
rand_size_h2 = rand_size - rand_size_h1
# No clamping with max/min needed: corners outside the image simply match no index
x1 = rand_pos_x - rand_size_h1
y1 = rand_pos_y - rand_size_h1
x2 = rand_pos_x + rand_size_h2
y2 = rand_pos_y + rand_size_h2
# Make mask
x_range = tf.range(shape_x)[:, tf.newaxis]  # [[0], [1], ..., [shape_x - 1]]
y_range = tf.range(shape_y)[tf.newaxis, :]  # [[0, 1, ..., shape_y - 1]]
mask = (x1 <= x_range) & (x_range < x2) & (y1 <= y_range) & (y_range < y2)
# Mask image
res = tf.where(mask, tf.zeros_like(data), data)
with tf.Session() as sess:
    res_val, win_pos = sess.run([res, ((x1, y1), (x2, y2))])
print('Window position: {}'.format(win_pos))
print(res_val)

Output:

Window position: ((0, 3), (2, 5))
[[ 0.7064023   0.25481129  0.81838679  0.          0.          0.22503126]
 [ 0.81459022  0.97437024  0.60148883  0.          0.          0.54877841]
 [ 0.9122386   0.69985616  0.39421701  0.01544178  0.84893608  0.69215906]
 [ 0.94896781  0.33128083  0.23551905  0.62681305  0.61286592  0.83353639]]
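
The comparison-based mask in the answer can be checked in NumPy, where broadcasting follows the same rules. The sketch below (the function name window_mask is made up for illustration) suggests that the comparisons reproduce the slice assignment from the question, and that corners lying outside the image need no clamping because out-of-range indices never satisfy the comparisons.

```python
import numpy as np

def window_mask(shape, x1, x2, y1, y2):
    """True inside the window [x1, x2) x [y1, y2); corners may lie outside the image."""
    x_range = np.arange(shape[0])[:, np.newaxis]  # column vector, shape (rows, 1)
    y_range = np.arange(shape[1])[np.newaxis, :]  # row vector, shape (1, cols)
    # The four comparisons broadcast to the full (rows, cols) shape
    return (x1 <= x_range) & (x_range < x2) & (y1 <= y_range) & (y_range < y2)

data = np.arange(1, 25, dtype=float).reshape(4, 6)

# Window from the answer's sample output: corners (0, 3) and (2, 5)
masked = np.where(window_mask(data.shape, 0, 2, 3, 5), 0.0, data)

# Reference result using slice assignment as in the question
reference = data.copy()
reference[0:2, 3:5] = 0.0
print(np.array_equal(masked, reference))

# A window hanging over the top-left edge: no max/min clamping required
print(window_mask((4, 6), -1, 1, -2, 2).astype(int))
```

Note that True marks the pixels to be zeroed here, as in the answer, which is the opposite of the mask convention used in the question.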