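I wrote the following code. It takes a feature_map tensor of shape (n_channels, img_height, img_width) and a region of interest of shape (5,), whose elements are (1, xmin, ymin, xmax, ymax), and returns the maximum element of each channel within the specified region: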
import tensorflow as tf
from tensorflow.keras import backend as K  # assumed: K is the Keras backend

def tf_box_pool(self, feature_map, roi):
    ''' Extracts region of interest from feature map
    '''
    # Compute the scaled down region of interest
    self.spatial_scale = K.cast(self.spatial_scale, 'float32')
    roi_start_w = tf.math.scalar_mul(self.spatial_scale, tf.cast(roi[1], 'float32'))
    roi_start_h = tf.math.scalar_mul(self.spatial_scale, tf.cast(roi[2], 'float32'))
    roi_end_w = tf.math.scalar_mul(self.spatial_scale, tf.cast(roi[3], 'float32'))
    roi_end_h = tf.math.scalar_mul(self.spatial_scale, tf.cast(roi[4], 'float32'))
    # Region size in feature-map cells, at least 1 along each axis
    roi_height = tf.math.round(tf.math.maximum(roi_end_h - roi_start_h + 1, 1))
    roi_width = tf.math.round(tf.math.maximum(roi_end_w - roi_start_w + 1, 1))
    # Integer slice bounds (the cast truncates the start coordinates)
    h_start = K.cast(roi_start_h, 'int32')
    height = K.cast(roi_height, 'int32')
    h_end = h_start + height
    w_start = K.cast(roi_start_w, 'int32')
    width = K.cast(roi_width, 'int32')
    w_end = w_start + width
    # Crop the region and take the maximum of each channel
    mapped_region = feature_map[:, h_start:h_end, w_start:w_end]
    pooled_features = tf.math.reduce_max(mapped_region, axis=[1,2])
    return pooled_features
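Now I have two tensors, one representing a batch of images and the other a batch of lists of regions of interest. Their shapes are (batch_size, n_channels, img_width, img_height) and (batch_size, n_rois, 5), respectively.
I want to apply the function above n_rois times to each image in the first tensor, feeding it each of that image's regions of interest from the second tensor. The final result should have shape (batch_size, n_rois, n_channels). How can I do this?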
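To make the desired result concrete, here is a minimal plain-Python reference, with nested lists standing in for tensors. It is only a sketch of the intended semantics, not a TensorFlow solution: the names box_pool_ref and batch_box_pool_ref and the spatial_scale default of 1.0 are hypothetical, and it assumes each image is indexed as [channel][row][column]. It mirrors the scaling, rounding and truncation of tf_box_pool, so a batched TensorFlow version could be checked against it on small inputs.

```python
def box_pool_ref(feature_map, roi, spatial_scale=1.0):
    """Per-channel maximum inside one ROI (hypothetical reference helper).

    feature_map: nested list indexed [channel][row][col] (assumed layout)
    roi: (1, xmin, ymin, xmax, ymax), as in the question
    """
    _, xmin, ymin, xmax, ymax = roi
    start_w, start_h = spatial_scale * xmin, spatial_scale * ymin
    end_w, end_h = spatial_scale * xmax, spatial_scale * ymax
    # Region size, at least 1 along each axis
    height = int(round(max(end_h - start_h + 1, 1)))
    width = int(round(max(end_w - start_w + 1, 1)))
    # int() truncates, like the int32 cast in tf_box_pool
    h0, w0 = int(start_h), int(start_w)
    return [
        max(
            (v for row in channel[h0:h0 + height] for v in row[w0:w0 + width]),
            default=float("-inf"),  # an empty crop yields -inf instead of raising
        )
        for channel in feature_map
    ]


def batch_box_pool_ref(feature_maps, rois, spatial_scale=1.0):
    """Apply box_pool_ref to every ROI of every image.

    Returns a nested list of shape (batch_size, n_rois, n_channels).
    """
    return [
        [box_pool_ref(feature_map, roi, spatial_scale) for roi in image_rois]
        for feature_map, image_rois in zip(feature_maps, rois)
    ]


# Two 3-channel 4x4 images filled with consecutive integers
feature_maps = [
    [[[b * 48 + c * 16 + h * 4 + w for w in range(4)] for h in range(4)]
     for c in range(3)]
    for b in range(2)
]
# Two ROIs per image, each (1, xmin, ymin, xmax, ymax)
rois = [
    [(1, 0, 0, 1, 1), (1, 2, 2, 3, 3)],
    [(1, 0, 0, 3, 3), (1, 1, 1, 1, 1)],
]
pooled = batch_box_pool_ref(feature_maps, rois)
```

Here pooled has one entry per image, each holding one list of n_channels maxima per ROI, which is the (batch_size, n_rois, n_channels) layout asked for above.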