Question

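I have images with regions of interest (ROIs). I want to apply random transformations to an image while keeping its regions of interest correct.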

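My code takes a list of boxes in the format [x_min, y_min, x_max, y_max]. It then converts each box into a list of its vertices, [up_left, up_right, down_right, down_left]. This is a list of vectors, so I can apply the transformation to the vectors.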
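To make the conversion concrete, here is a minimal sketch of what such a box-to-vertex helper could look like. The name boxes_to_vertices matches the call in the code further down, but this body is an assumption for illustration, not the implementation from the pastebin. It also assumes the image convention that y grows downwards, so "up" means y_min.

```python
import numpy as np


def boxes_to_vertices(boxes):
    """Hypothetical helper: (N, 4) boxes -> (N, 4, 2) corner vertices."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    x_min, y_min, x_max, y_max = boxes.T
    # Corner order: up_left, up_right, down_right, down_left.
    # Assumption: y grows downwards, so "up" corresponds to y_min.
    return np.stack([
        np.stack([x_min, y_min], axis=-1),
        np.stack([x_max, y_min], axis=-1),
        np.stack([x_max, y_max], axis=-1),
        np.stack([x_min, y_max], axis=-1),
    ], axis=1)


vertices = boxes_to_vertices([[0.2, 0.3, 0.6, 0.8]])
```

Reshaping the result with reshape((-1, 2)) gives a flat list of 2D vectors that a single matrix product can transform.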

The next step is to find the new [x_min, y_min, x_max, y_max] from the list of transformed vertices.
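A minimal sketch of that step, assuming the vertices arrive as four corners per box: take the minimum and maximum over the corners of each box. The name vertices_to_boxes matches the call in the code further down, but the body here is a hypothetical illustration.

```python
import numpy as np


def vertices_to_boxes(vertices):
    """Hypothetical helper: transformed corners -> axis-aligned boxes."""
    vertices = np.asarray(vertices, dtype=float).reshape(-1, 4, 2)
    mins = vertices.min(axis=1)  # (N, 2): x_min, y_min per box
    maxs = vertices.max(axis=1)  # (N, 2): x_max, y_max per box
    return np.concatenate([mins, maxs], axis=1)


# A square rotated by 45 degrees about its centre (a diamond).
diamond = [[0.5, 0.3], [0.7, 0.5], [0.5, 0.7], [0.3, 0.5]]
boxes = vertices_to_boxes(diamond)
```

Note that the axis-aligned box around rotated corners is generally larger than the original box, so the ROI grows after a rotation.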

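My first application was rotation, and that works fine:

Here is the corresponding code. The first part is taken from the keras codebase; scroll down to the NEW CODE comment. If I get the code to work, I would be interested in integrating it into keras, so I am trying to fit my code into their image preprocessing infrastructure: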

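import numpy as np

# transform_matrix_offset_center and apply_transform come from the keras
# image preprocessing code; boxes_to_vertices and vertices_to_boxes are my
# own helpers.


def random_rotation_with_boxes(x, boxes, rg, row_axis=1, col_axis=2,
                               channel_axis=0, fill_mode='nearest', cval=0.):
    """Performs a random rotation of a Numpy image tensor.

    Also rotates the corresponding bounding boxes.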

    # Arguments
        x: Input tensor. Must be 3D.
        boxes: a list of bounding boxes [xmin, ymin, xmax, ymax], values in [0,1].
        rg: Rotation range, in degrees.
        row_axis: Index of axis for rows in the input tensor.
        col_axis: Index of axis for columns in the input tensor.
        channel_axis: Index of axis for channels in the input tensor.
        fill_mode: Points outside the boundaries of the input
            are filled according to the given mode
            (one of `{'constant', 'nearest', 'reflect', 'wrap'}`).
        cval: Value used for points outside the boundaries
            of the input if `mode='constant'`.

    # Returns
        Rotated Numpy image tensor, the rotated bounding boxes,
        and the rotated vertices.
    """

    # sample parameter for augmentation
    theta = np.pi / 180 * np.random.uniform(-rg, rg)

    # apply to image
    rotation_matrix = np.array([[np.cos(theta), -np.sin(theta), 0],
                                [np.sin(theta), np.cos(theta), 0],
                                [0, 0, 1]])

    h, w = x.shape[row_axis], x.shape[col_axis]
    transform_matrix = transform_matrix_offset_center(rotation_matrix, h, w)
    x = apply_transform(x, transform_matrix, channel_axis, fill_mode, cval)


    # -------------------------------------------------
    # NEW CODE FROM HERE
    # -------------------------------------------------
    # apply to vertices
    vertices = boxes_to_vertices(boxes)
    vertices = vertices.reshape((-1, 2))

    # apply offset to have pivot point at [0.5, 0.5]
    vertices -= [0.5, 0.5]

    # apply rotation, we only need the rotation part of the matrix
    vertices = np.dot(vertices, rotation_matrix[:2, :2])
    vertices += [0.5, 0.5]

    boxes = vertices_to_boxes(vertices)

    return x, boxes, vertices

As you can see, they use scipy.ndimage (inside apply_transform) to apply the transformation to the image.

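The coordinates of my bounding boxes lie in [0, 1], with the center at [0.5, 0.5]. The rotation needs to be applied around [0.5, 0.5] as the pivot point. One can use homogeneous coordinates and matrices to shift, rotate, and shift back the vectors. That is what they do for the image. There is an existing function, transform_matrix_offset_center, but it offsets to float(width)/2 + 0.5. The +0.5 makes it unsuitable for coordinates in [0, 1], so I shift the vectors myself.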
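The shift, rotate, shift back idea can be sketched with 3x3 homogeneous matrices. The helpers about_pivot and apply_homogeneous are hypothetical names introduced for this illustration; they are not part of keras. The sketch treats points as column vectors (matrix @ point), which differs from the np.dot(vertices, matrix) form used in the code above.

```python
import numpy as np


def about_pivot(matrix, pivot=(0.5, 0.5)):
    """Wrap a 3x3 homogeneous transform so that it acts about `pivot`."""
    px, py = pivot
    to_origin = np.array([[1, 0, -px], [0, 1, -py], [0, 0, 1]], dtype=float)
    back = np.array([[1, 0, px], [0, 1, py], [0, 0, 1]], dtype=float)
    # Read right to left: shift pivot to origin, transform, shift back.
    return back @ matrix @ to_origin


def apply_homogeneous(matrix, points):
    """Apply a 3x3 matrix to (N, 2) points treated as column vectors."""
    points = np.asarray(points, dtype=float)
    ones = np.ones((len(points), 1))
    homogeneous = np.hstack([points, ones])  # (N, 3)
    return (matrix @ homogeneous.T).T[:, :2]


theta = np.pi / 2
rotation = np.array([[np.cos(theta), -np.sin(theta), 0],
                     [np.sin(theta), np.cos(theta), 0],
                     [0, 0, 1]])
moved = apply_homogeneous(about_pivot(rotation), [[0.5, 0.5], [1.0, 0.5]])
```

The pivot [0.5, 0.5] stays fixed, while [1.0, 0.5] is carried a quarter turn around it to [0.5, 1.0]. Subtracting and adding [0.5, 0.5] by hand, as in the code above, is the same operation written out step by step.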

For rotation, this code works fine. I thought it would be generally applicable.

But for zooming, it fails in a strange way. The code is very similar:

vertices -= [0.5, 0.5]

# apply zoom, we only need the zoom part of the matrix
vertices = np.dot(vertices, zoom_matrix[:2, :2])
vertices += [0.5, 0.5]

The output looks like this:

There seem to be several problems:

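The shift is broken. In image 1, the ROI and the corresponding part of the image barely overlap.
The coordinates seem to be switched. In image 2, the ROI and the image seem to be scaled differently along the x and y axes.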

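I tried switching the axes with (zoom_matrix[:2, :2].T)[::-1, ::-1]. That led to this:

Now the scale factor is broken? I have tried many different variations of this matrix multiplication: transposing, mirroring, changing the scale factor, and so on. I can't seem to get it right.

In any case, I think the original code should be correct. After all, it works for rotation. At this point I am wondering whether this is a peculiarity of scipy's ndimage resampling.

Is this an error in my math, or am I missing something needed to truly mimic scipy's ndimage resampling?

I have put the full source code on pastebin. I have only changed small parts; most of it is code from keras: https://pastebin.com/tsHnLLgy

The code that uses the new augmentation and produces these images is here: https://nbviewer.jupyter.org/gist/lhk/b8f30e9f30c5d395b99188a53524c53e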

Update

If the zoom factors are inverted, the transformation works. For zooming, this operation is simple and can be expressed as:

# vertices is an array of shape [number of vertices, 2]
vertices *= [1/zx, 1/zy]

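This corresponds to applying the inverse transformation to the vertices. In the context of image resampling, this might make sense. An image can be resampled like this: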

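Create a coordinate vector for every pixel.
Apply the inverse transformation to each vector.
Interpolate the original image to find the value the vector now points to.
Write this value to the output image at the original position.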

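But for rotation, I did not invert the matrix, and the operation works correctly.

The question of how to fix this seems to be answered. But I don't understand why.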
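One observation that may be part of the explanation, offered as a sketch rather than a confirmed diagnosis: np.dot(vertices, matrix) multiplies row vectors from the left, which equals applying the transposed matrix to column vectors. For a rotation matrix the transpose is the inverse, while for a diagonal zoom matrix the transpose changes nothing. The values of theta, zoom, and vertices below are made up for the demonstration, and the sketch does not account for the (row, col) versus (x, y) axis order used by keras.

```python
import numpy as np

theta = np.deg2rad(30)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
zoom = np.diag([2.0, 0.5])  # hypothetical zx, zy
vertices = np.array([[0.25, 0.1], [-0.3, 0.4]])

# Row vectors times a matrix == transposed matrix applied to column vectors.
row_rotated = np.dot(vertices, rotation)
rotation_is_inverted = np.allclose(
    row_rotated, (np.linalg.inv(rotation) @ vertices.T).T)

row_zoomed = np.dot(vertices, zoom)
zoom_is_inverted = np.allclose(
    row_zoomed, (np.linalg.inv(zoom) @ vertices.T).T)
zoom_is_forward = np.allclose(row_zoomed, (zoom @ vertices.T).T)
```

Here rotation_is_inverted and zoom_is_forward come out True and zoom_is_inverted comes out False: the same line of code implicitly inverts the rotation but not the zoom.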

ROI augmentation in keras: scipy.ndimage transformations
