Question

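I wrote the following code, which takes a feature_map tensor of shape (n_channels, img_height, img_width) and a region of interest of shape (5,) (whose elements are (1, xmin, ymin, xmax, ymax)), and returns the maximum element of each channel within the specified region: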

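# Assumed imports (not shown in the original snippet)
import tensorflow as tf
from tensorflow.keras import backend as K

def tf_box_pool(self, feature_map, roi):
    '''Extracts a region of interest from a feature map and max-pools it.

    feature_map: tensor of shape (n_channels, img_height, img_width)
    roi: tensor of shape (5,) holding (1, xmin, ymin, xmax, ymax)
    Returns a tensor of shape (n_channels,).
    '''

    # Compute the scaled-down region of interest in feature-map coordinates
    self.spatial_scale = K.cast(self.spatial_scale, 'float32')

    roi_start_w = tf.math.scalar_mul(self.spatial_scale, tf.cast(roi[1], 'float32'))
    roi_start_h = tf.math.scalar_mul(self.spatial_scale, tf.cast(roi[2], 'float32'))
    roi_end_w = tf.math.scalar_mul(self.spatial_scale, tf.cast(roi[3], 'float32'))
    roi_end_h = tf.math.scalar_mul(self.spatial_scale, tf.cast(roi[4], 'float32'))

    # Force the region to cover at least one cell in each direction
    roi_height = tf.math.round(tf.math.maximum(roi_end_h - roi_start_h + 1, 1))
    roi_width = tf.math.round(tf.math.maximum(roi_end_w - roi_start_w + 1, 1))

    # Convert to integer slice bounds
    h_start = K.cast(roi_start_h, 'int32')
    height = K.cast(roi_height, 'int32')
    h_end = h_start + height
    w_start = K.cast(roi_start_w, 'int32')
    width = K.cast(roi_width, 'int32')
    w_end = w_start + width

    # Crop the region from every channel
    mapped_region = feature_map[:, h_start:h_end, w_start:w_end]

    # Max over the spatial axes leaves one value per channel
    pooled_features = tf.math.reduce_max(mapped_region, axis=[1, 2])

    return pooled_features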
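
To make the index arithmetic in the function concrete, here is a minimal pure-Python sketch of the same computation for a single ROI. The helper name roi_to_slice and the spatial_scale value of 1/16 are made up for illustration. Python's int() and round() stand in for the int32 cast and tf.math.round.

```python
def roi_to_slice(roi, spatial_scale):
    """Mirror the index arithmetic of tf_box_pool in plain Python.

    roi is (1, xmin, ymin, xmax, ymax) in image coordinates.
    Returns (h_start, h_end, w_start, w_end) in feature-map coordinates.
    """
    _, xmin, ymin, xmax, ymax = roi

    # Scale the corners down to feature-map coordinates
    start_w = spatial_scale * xmin
    start_h = spatial_scale * ymin
    end_w = spatial_scale * xmax
    end_h = spatial_scale * ymax

    # At least one cell in each direction, rounded to a whole number
    height = round(max(end_h - start_h + 1, 1))
    width = round(max(end_w - start_w + 1, 1))

    # Truncate the start coordinates, as the int32 cast does
    h_start = int(start_h)
    w_start = int(start_w)
    return h_start, h_start + height, w_start, w_start + width


# Hypothetical ROI on an image whose feature map is 16x smaller
print(roi_to_slice((1, 32, 48, 95, 111), spatial_scale=1 / 16))  # (3, 8, 2, 7)
```

With these numbers the function would pool over feature_map[:, 3:8, 2:7].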

Now I have two tensors, representing a batch of images and a batch of lists of regions of interest, with shapes (batch_size, n_channels, img_width, img_height) and (batch_size, n_rois, 5) respectively.

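I want to apply the function above n_rois times to each image in the first tensor, feeding it each of that image's regions of interest from the second tensor in turn. The final result should be a tensor of shape (batch_size, n_rois, n_channels).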

How can I do this?
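
One direction that might work is nesting two tf.map_fn calls: the outer one walks over (image, ROI list) pairs, the inner one over the ROIs of a single image. The sketch below is untested against a real model. The names box_pool and batch_box_pool are hypothetical stand-alone helpers, not part of the original class, and the code assumes TensorFlow 2.3 or later (for fn_output_signature), float32 feature maps, and a per-image layout of (n_channels, img_height, img_width).

```python
import tensorflow as tf


def box_pool(feature_map, roi, spatial_scale):
    """Stand-alone variant of tf_box_pool for one feature map and one ROI."""
    scale = tf.cast(spatial_scale, tf.float32)
    roi = tf.cast(roi, tf.float32)

    # Scale ROI corners to feature-map coordinates
    start_w, start_h = scale * roi[1], scale * roi[2]
    end_w, end_h = scale * roi[3], scale * roi[4]

    # Region size, at least one cell in each direction
    height = tf.cast(tf.round(tf.maximum(end_h - start_h + 1, 1)), tf.int32)
    width = tf.cast(tf.round(tf.maximum(end_w - start_w + 1, 1)), tf.int32)
    h_start = tf.cast(start_h, tf.int32)
    w_start = tf.cast(start_w, tf.int32)

    region = feature_map[:, h_start:h_start + height, w_start:w_start + width]
    return tf.reduce_max(region, axis=[1, 2])  # shape (n_channels,)


def batch_box_pool(feature_maps, rois, spatial_scale=1.0):
    """Pool every ROI of every image.

    feature_maps: (batch_size, n_channels, img_height, img_width), float32
    rois: (batch_size, n_rois, 5)
    Returns a tensor of shape (batch_size, n_rois, n_channels).
    """
    def pool_one_image(args):
        feature_map, image_rois = args
        # Inner loop: one pooled vector per ROI of this image
        return tf.map_fn(
            lambda roi: box_pool(feature_map, roi, spatial_scale),
            image_rois,
            fn_output_signature=tf.float32,
        )

    # Outer loop: one (n_rois, n_channels) block per image
    return tf.map_fn(
        pool_one_image,
        (feature_maps, rois),
        fn_output_signature=tf.float32,
    )


# Toy data: 2 images, 3 channels, 4x4 feature maps, 2 ROIs per image
feature_maps = tf.reshape(tf.range(2 * 3 * 4 * 4, dtype=tf.float32), (2, 3, 4, 4))
rois = tf.constant([[[1, 0, 0, 1, 1], [1, 1, 1, 3, 3]]] * 2, dtype=tf.float32)
pooled = batch_box_pool(feature_maps, rois)
print(pooled.shape)  # (2, 2, 3)
```

Nested tf.map_fn keeps the per-ROI logic unchanged, which is its main appeal here. It is not necessarily the fastest option, since each ROI is processed as a separate loop iteration.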

Implementing ROI pooling in TensorFlow

0 answers: