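I want to sparsify my input data by setting squares of random size and position to zero. My network should learn to predict pixel-wise outputs (e.g. depth or semantic segmentation) even when the input data is sparse. In numpy I would do the following: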
import numpy as np

# create random input data
data = np.random.uniform(0, 1, [4, 4])
# initialize mask (np.bool was removed in NumPy 1.24; use the builtin bool)
mask = np.ones_like(data, dtype=bool)
# get random position and size of dropout square
np.random.seed(42) # just for this example
# draw scalars (no size argument) so int() is not applied to a 1-element array
rand_pos_x = int(np.random.uniform(0, data.shape[0]))
rand_pos_y = int(np.random.uniform(0, data.shape[1]))
rand_size = int(np.random.uniform(0.2 * data.shape[0], 0.8 * data.shape[0]))
# compute upper left and lower right corners in image coordinates
x1 = int(max(rand_pos_x - np.floor(rand_size / 2), 0))
y1 = int(max(rand_pos_y - np.floor(rand_size / 2), 0))
x2 = int(min(rand_pos_x + np.ceil(rand_size / 2), data.shape[0]))
y2 = int(min(rand_pos_y + np.ceil(rand_size / 2), data.shape[1]))
# set values in input data in random square to zero
mask[x1:x2, y1:y2] = False # <-- how to do this in tf?
network_input = np.where(mask, data, 0)
print(rand_pos_x, rand_pos_y, rand_size)
print(data)
print(mask)
print(network_input)
Output:
1 3 2
[[0.48526512 0.69295915 0.0659424 0.96775734]
[0.29714754 0.82867678 0.24399012 0.40785638]
[0.44178606 0.71495478 0.55438262 0.64918671]
[0.72574993 0.44672654 0.06619564 0.43418488]]
[[ True True False False]
[ True True False False]
[ True True True True]
[ True True True True]]
[[0.48526512 0.69295915 0. 0. ]
[0.29714754 0.82867678 0. 0. ]
[0.44178606 0.71495478 0.55438262 0.64918671]
[0.72574993 0.44672654 0.06619564 0.43418488]]
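The line I do not know how to convert into proper TensorFlow code is
mask[x1:x2, y1:y2] = False
A function that could be used for this might be tf.assign. A "dirty" variant would be to use tf.stack to stitch together rectangles of False and True values.
Thanks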
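Answer 0 (score: 0)
I would not use a variable, since variables are usually reserved for storing data across different run calls. The usual approach is to create a new tensor holding the modified data. Here is one way to do what you want: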
import tensorflow as tf  # TensorFlow 1.x API (tf.random_uniform, tf.Session)

# Generate data
data = tf.random_uniform((4, 6), 0, 1)
shape = tf.shape(data)
shape_x, shape_y = shape[0], shape[1]
# Pick random window
rand_pos_x = tf.random_uniform((), maxval=shape_x, dtype=tf.int32)
rand_pos_y = tf.random_uniform((), maxval=shape_y, dtype=tf.int32)
min_size = tf.cast(tf.ceil(0.2 * tf.cast(shape_x, dtype=tf.float32)), dtype=tf.int32)
max_size = tf.cast(0.8 * tf.cast(shape_x, dtype=tf.float32), dtype=tf.int32)
rand_size = tf.random_uniform((), minval=min_size, maxval=max_size, dtype=tf.int32)
#rand_size = tf.random_uniform((), minval=tf.cast(0.2 * shape_x, dtype=tf.int32),
# maxval=tf.cast(0.8 * shape_x, dtype=tf.int32),
# dtype=tf.int32)
rand_size_h1 = rand_size // 2
rand_size_h2 = rand_size - rand_size_h1
# max and min ops not needed with this method
x1 = rand_pos_x - rand_size_h1
y1 = rand_pos_y - rand_size_h1
x2 = rand_pos_x + rand_size_h2
y2 = rand_pos_y + rand_size_h2
# Make mask
x_range = tf.range(shape_x)[:, tf.newaxis] # column: [[0], [1], ..., [shape_x - 1]]
y_range = tf.range(shape_y)[tf.newaxis, :] # row: [[0, 1, ..., shape_y - 1]]
mask = (x1 <= x_range) & (x_range < x2) & (y1 <= y_range) & (y_range < y2)
# Mask image
res = tf.where(mask, tf.zeros_like(data), data)
# Evaluate the masked data and the window corners in a single run call
with tf.Session() as sess:
    res_val, win_pos = sess.run([res, ((x1, y1), (x2, y2))])
print('Window position: {}'.format(win_pos))
print(res_val)
Output:
Window position: ((0, 3), (2, 5))
[[ 0.7064023 0.25481129 0.81838679 0. 0. 0.22503126]
[ 0.81459022 0.97437024 0.60148883 0. 0. 0.54877841]
[ 0.9122386 0.69985616 0.39421701 0.01544178 0.84893608 0.69215906]
[ 0.94896781 0.33128083 0.23551905 0.62681305 0.61286592 0.83353639]]
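To check the masking logic without a TensorFlow session, the same broadcasting comparison can be sketched in NumPy. This is only an illustration under assumptions: `square_mask` is a hypothetical helper name, and `pos` and `size` are fixed by hand instead of drawn at random. It also shows why the max and min clamping is unnecessary: corners that fall outside the array simply match no index.

```python
import numpy as np

def square_mask(shape, pos, size):
    """Hypothetical helper: boolean array that is True inside the square window.

    pos is the (row, col) centre of the window, size is its side length.
    """
    half_lo = size // 2
    half_hi = size - half_lo
    # Corners may lie outside the array; no clamping is required.
    x1, x2 = pos[0] - half_lo, pos[0] + half_hi
    y1, y2 = pos[1] - half_lo, pos[1] + half_hi
    x_range = np.arange(shape[0])[:, np.newaxis]  # column vector, shape (rows, 1)
    y_range = np.arange(shape[1])[np.newaxis, :]  # row vector, shape (1, cols)
    # Broadcasting the comparisons yields a (rows, cols) boolean array.
    return (x1 <= x_range) & (x_range < x2) & (y1 <= y_range) & (y_range < y2)

data = np.arange(1.0, 17.0).reshape(4, 4)
window = square_mask(data.shape, pos=(1, 3), size=2)
# Zero the entries inside the window, keep everything else.
network_input = np.where(window, 0.0, data)
print(window)
print(network_input)
```

With `pos=(1, 3)` and `size=2` the window covers rows 0 to 1 and columns 2 to 3, the same region that the question's `mask[x1:x2, y1:y2] = False` clears. Note that the window here is True where values are zeroed, whereas the mask in the question is True where values are kept.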