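I am trying to extract values from a 2D tensor inside several rectangular regions. I want to crop each rectangular region while setting all values outside the box to zero.
For example, from a 9 x 9 image I want to obtain two separate images, each holding the values inside one of the two red rectangular boxes, with all remaining values set to zero. Is there a convenient way to do this with TensorFlow slicing?
One approach I thought of is to define a mask array that is 1 inside the box and 0 outside, and multiply it with the input array. However, this requires looping over the boxes and changing which mask values are set to 0 on each iteration. Is there a faster, more efficient way to do this without a for loop? Does TensorFlow have an equivalent crop-and-replace function? Here is my code using a for loop. Any input on this is appreciated. Thanks.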
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.patches as patches
tf.reset_default_graph()
size = 9 # size of input image
num_boxes = 2 # number of rectangular boxes
def get_cutout(X, bboxs):
    """Returns copies of X with values only inside bboxs"""
    out = []
    for i in range(num_boxes):
        bbox = bboxs[i]  # get rectangular box coordinates
        Y = tf.Variable(np.zeros((size, size)), dtype=tf.float32)  # define temporary mask
        # set values of mask inside box to 1
        # (rows bbox[0]:bbox[2], columns bbox[1]:bbox[3])
        t = [Y[bbox[0]:bbox[2], bbox[1]:bbox[3]].assign(
            tf.ones((bbox[2]-bbox[0], bbox[3]-bbox[1])))]
        with tf.control_dependencies(t):
            mask = tf.identity(Y)
        out.append(X * mask)  # get values inside rectangular box
    return out, X
# define a 9x9 input image X and convert to tensor
in_x = np.eye(size)
in_x[0:3] = np.random.rand(3, 9)
X = tf.constant(in_x, dtype=tf.float32)
bboxs = tf.placeholder(tf.int32, [None, 4])  # placeholder for rectangular box
X_outs = get_cutout(X, bboxs)
# coordinates of box ((bottom left x, bottom left y, top right x, top right y))
in_bbox = [[1,3,3,6], [4,3,7,8]]
feed_dict = {bboxs: in_bbox}
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x_out = sess.run(X_outs, feed_dict=feed_dict)
# plot results
# x_out is (list of cutouts, X), so x_out[1] is the input image
vmin = np.min(x_out[1])
vmax = np.max(x_out[1])
fig, ax = plt.subplots(nrows=1, ncols=1+len(in_bbox), figsize=(10,2))
im = ax[0].imshow(x_out[1], vmin=vmin, vmax=vmax, origin='lower')
plt.colorbar(im, ax=ax[0])
ax[0].set_title("input X")
for i, bbox in enumerate(in_bbox):
    bottom_left = (bbox[1]-0.5, bbox[0]-0.5)
    width = bbox[3]-bbox[1]
    height = bbox[2]-bbox[0]
    rect = patches.Rectangle(bottom_left, width, height,
                             linewidth=1, edgecolor='r', facecolor='none')
    ax[0].add_patch(rect)
    ax[i+1].set_title("extract values in box {}".format(i+1))
    im = ax[i + 1].imshow(x_out[0][i], vmin=vmin, vmax=vmax, origin='lower')
    plt.colorbar(im, ax=ax[i+1])
Answer 0 (score: 0)
You can create the mask with tf.pad.
crop = tf.ones([3, 3])
# "before_axis_x": how much padding is added before the cropping zone along axis x
# "after_axis_x": how much padding is added after the cropping zone along axis x
mask = tf.pad(crop, [[before_axis_0, after_axis_0], [before_axis_1, after_axis_1]])
tf.boolean_mask(image, mask)  # creates the extracted image
To get the same behavior as tf.image.crop_and_resize, here is a function that takes an array of boxes and returns an array of extracted images with padding.
def extract_with_padding(image, boxes):
    """
    boxes: tensor of shape [num_boxes, 4].
        boxes are the coordinates of the extracted part
        box is an array [y1, x1, y2, x2]
        where [y1, x1] (respectively [y2, x2]) are the coordinates
        of the top left (respectively bottom right) part of the image
    image: tensor containing the initial image
    """
    extracted = []
    shape = tf.shape(image)
    for b in boxes:
        crop = tf.ones([3, 3])
        mask = tf.pad(crop, [[b[0], shape[0] - b[2]], [b[1], shape[1] - b[3]]])
        extracted.append(tf.boolean_mask(image, mask))
    return extracted
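To make the padding arithmetic concrete, here is a small worked example. It is only a sketch: it uses NumPy's np.pad as a stand-in for tf.pad (both take one [before, after] pair per axis), it assumes a 9 x 9 image and the box [1, 3, 3, 6] from the question, and it sizes the block of ones to the box instead of a fixed 3 x 3, which is my own assumption.

```python
import numpy as np

height, width = 9, 9          # image size from the question
y1, x1, y2, x2 = 1, 3, 3, 6   # box [y1, x1, y2, x2], end indices exclusive

# Block of ones with the same size as the box (2 rows x 3 columns here).
crop = np.ones((y2 - y1, x2 - x1))

# Zeros added before/after the block along each axis so that the
# result has the full image shape.
paddings = [(y1, height - y2), (x1, width - x2)]  # [(1, 6), (3, 3)]

mask = np.pad(crop, paddings, mode="constant")  # 9 x 9, ones only inside the box
```

Multiplying the image by this mask keeps the values inside the box and zeros everything else.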
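Answer 1 (score: 0)
Thanks @edkevekeh for this really nice function. I had to modify it slightly to get it to do what I wanted. First, I could not iterate over boxes, since it is a Tensor object. Also, the crop size is determined by the box and is not always 3x3. Finally, tf.boolean_mask returns the crop, but I want to keep the crop in place and replace everything outside it with 0. So I replaced tf.boolean_mask with a multiplication.
For my use case num_boxes can be large, so I wanted to know whether there is a more efficient way than the for loop; I am guessing not. Here is my modified version of @edkevekeh's solution, in case anyone needs it.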
def extract_with_padding(image, boxes):
    """
    boxes: tensor of shape [num_boxes, 4].
        boxes are the coordinates of the extracted part
        box is an array [y1, x1, y2, x2]
        where [y1, x1] (respectively [y2, x2]) are the coordinates
        of the top left (respectively bottom right) part of the image
    image: tensor containing the initial image
    """
    extracted = []
    shape = tf.shape(image)
    # boxes.shape[0] must be statically known here (not None)
    for i in range(boxes.shape[0]):
        b = boxes[i]
        # block of ones with the same size as the box
        crop = tf.ones([b[2] - b[0], b[3] - b[1]])
        # pad with zeros up to the full image size
        mask = tf.pad(crop, [[b[0], shape[0] - b[2]], [b[1], shape[1] - b[3]]])
        extracted.append(image*mask)
    return extracted
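On the question of avoiding the for loop: one possible approach is to build all masks at once by comparing row and column indices against the box coordinates and letting broadcasting do the rest. This is a sketch, not a tested drop-in replacement. The function name extract_with_padding_vectorized is made up for illustration, and it assumes a 2-D image and int32 boxes in the same [y1, x1, y2, x2] layout with exclusive end indices. The usage lines are written for eager execution; in graph mode the result would be evaluated with sess.run instead.

```python
import numpy as np
import tensorflow as tf


def extract_with_padding_vectorized(image, boxes):
    """Zero out everything outside each box, for all boxes at once.

    image: 2-D float tensor of shape [height, width].
    boxes: int32 tensor of shape [num_boxes, 4], rows are [y1, x1, y2, x2]
        with y2 and x2 exclusive (same convention as the loop version).
    Returns a tensor of shape [num_boxes, height, width].
    """
    shape = tf.shape(image)
    rows = tf.reshape(tf.range(shape[0]), [1, -1, 1])  # [1, H, 1]
    cols = tf.reshape(tf.range(shape[1]), [1, 1, -1])  # [1, 1, W]

    # One coordinate per box, shaped [N, 1, 1] so it broadcasts
    # against the row and column index grids.
    y1 = tf.reshape(boxes[:, 0], [-1, 1, 1])
    x1 = tf.reshape(boxes[:, 1], [-1, 1, 1])
    y2 = tf.reshape(boxes[:, 2], [-1, 1, 1])
    x2 = tf.reshape(boxes[:, 3], [-1, 1, 1])

    in_rows = tf.logical_and(rows >= y1, rows < y2)  # [N, H, 1]
    in_cols = tf.logical_and(cols >= x1, cols < x2)  # [N, 1, W]
    inside = tf.logical_and(in_rows, in_cols)        # [N, H, W]

    # Broadcast the image over the box dimension and apply the masks.
    return image[tf.newaxis] * tf.cast(inside, image.dtype)


# Usage with the two boxes from the question (eager execution assumed).
image = tf.constant(np.arange(81, dtype=np.float32).reshape(9, 9))
boxes = tf.constant([[1, 3, 3, 6], [4, 3, 7, 8]], dtype=tf.int32)
cutouts = extract_with_padding_vectorized(image, boxes)  # shape [2, 9, 9]
```

Because nothing here depends on the number of boxes being known in Python, this should also work when the first dimension of boxes is None. The trade-off is memory: the result holds num_boxes full-size copies of the image.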