How is the feature map computed after applying RoIAlign, as described in the Mask R-CNN paper?

Time: 2018-04-14 03:15:41

Tags: deep-learning mask roi

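I am going through the Mask R-CNN slides given here, but I cannot work out how the feature map is computed after RoIAlign is applied, as shown in the image below. Both the paper and the slides say to use bilinear interpolation, but I don't see how to do that for the given image. Thanks.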

RoIAlign (Mask R-CNN)

2 answers:

Answer 0: (score: 2)

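Once the 4 sampling points have been placed inside each pooling bin, the value at each point is determined by bilinear interpolation from the 4 pixels nearest to it. Once you have a value for every point, you take either the average or the max of the 4 points in each bin. You put that value at the corresponding position in the output tensor. That gives you the forward pass, and the backward pass shouldn't be a problem either.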
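The steps above can be sketched in plain Python. This is only a minimal illustration, not code from the paper or from any library: the names `bilinear` and `roi_align` and their parameters are made up here, the feature map has a single channel, the RoI is given as `(y1, x1, y2, x2)` in feature-map coordinates, and integer coordinates are assumed to be pixel centers (real implementations differ on that convention).

```python
import math

def bilinear(feature, y, x):
    """Interpolate a 2D feature map (list of rows) at a continuous (y, x).
    Assumption: integer coordinates are pixel centers."""
    h, w = len(feature), len(feature[0])
    # Clamp so that the four neighbouring pixels always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, roi, out_h, out_w, samples=2, mode="avg"):
    """Pool one RoI into an out_h x out_w grid.
    samples=2 gives 2 x 2 = 4 regularly spaced points per bin."""
    y1, x1, y2, x2 = roi  # floats, deliberately not rounded
    bin_h = (y2 - y1) / out_h
    bin_w = (x2 - x1) / out_w
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            points = []
            for sy in range(samples):
                for sx in range(samples):
                    # Sample location inside bin (i, j).
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    points.append(bilinear(feature, y, x))
            if mode == "max":
                row.append(max(points))
            else:
                row.append(sum(points) / len(points))
        output.append(row)
    return output

# Toy feature map whose value equals the column index.
ramp = [[float(x) for x in range(4)] for _ in range(4)]
pooled = roi_align(ramp, (0.0, 0.0, 2.0, 2.0), 2, 2)
pooled_max = roi_align(ramp, (0.0, 0.0, 2.0, 2.0), 2, 2, mode="max")
```

On the toy ramp, each averaged bin comes out as the x coordinate of the bin center, which is a quick sanity check that the sampling grid is where you expect it to be.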

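For example, in the image, the first red point is surrounded by pixels with the values 0.85, 0.34, 0.32 and 0.74, and the resulting value is a function of:

  • these values

  • the distances from the red point to these pixels (their centers)

The closer the point is to a pixel, the closer its value is to that pixel's value.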
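To make the weighting concrete, here is a small sketch with the four values from the example. The layout (0.85 top-left, 0.34 top-right, 0.32 bottom-left, 0.74 bottom-right) and the offsets of the point are assumptions for illustration, since they have to be read off the image. The helper names `bilinear_weights` and `interpolate` are hypothetical.

```python
def bilinear_weights(dy, dx):
    """Weights for the (top-left, top-right, bottom-left, bottom-right)
    pixels, where dy and dx in [0, 1] are the point's offsets from the
    top-left pixel center, in units of the pixel spacing."""
    return ((1 - dy) * (1 - dx), (1 - dy) * dx, dy * (1 - dx), dy * dx)

def interpolate(values, dy, dx):
    """Weighted sum of the four surrounding pixel values."""
    return sum(w * v for w, v in zip(bilinear_weights(dy, dx), values))

# Assumed layout: top-left, top-right, bottom-left, bottom-right.
values = (0.85, 0.34, 0.32, 0.74)

center = interpolate(values, 0.5, 0.5)         # equidistant: plain average, 0.5625
near_top_left = interpolate(values, 0.1, 0.1)  # pulled toward 0.85, about 0.7553
```

The four weights always sum to 1, and each weight grows as the point moves toward the corresponding pixel center, which is exactly the "closer means more similar" behaviour described above.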

Answer 1: (score: 0)

Selecting diagonal pixels as nearest point, image from Paper

Also check this implementation

        # From Mask R-CNN paper: "We sample four regular locations, so
        # that we can evaluate either max or average pooling. In fact,
        # interpolating only a single value at each bin center (without
        # pooling) is nearly as effective."
        #
        # Here we use the simplified approach of a single value per bin,
        # which is how it's done in tf.crop_and_resize()
        # Result: [batch * num_boxes, pool_height, pool_width, channels]