Question

I have trained a super-resolution model. Inference works:

def inference(self, filename):
    img_in = misc.imread(filename) / 255.0
    res = self.sess.run(self.out, feed_dict={ self.x: [img_in], self.is_training: False, self.lr_input: 0.0 })[0] * 255.0
    image = misc.toimage(res, cmin=0, cmax=255)
    fn_res = filename.replace(".png", "_result.png").replace(".jpg", "_result.jpg")
    misc.imsave(fn_res, image)

But when the image is larger than 600x600 px, it fails with:

terminate called after throwing an instance of 'std::bad_alloc'
what():  std::bad_alloc
Aborted (core dumped)

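I am using 16 GB of RAM plus a 16 GB swap file, and I run inference on the CPU. The swap file, set up in Ubuntu 16.04, does not help even though it should be large enough.

I know I can work around this by splitting the image into pieces and processing them one after another. But the output then shows visible seams where the image was split.

Why does it not work? Is there a way to make Python or TensorFlow swap memory out to a file? Are there any other workarounds?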

Answer 1

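I solved it by splitting the image into pieces, processing them one by one, and combining the results. To avoid visible "split lines" in the output image, I repeat this process with different piece sizes and average the results.

I ended up with this: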

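# Imports this snippet relies on. Note: misc.imread, misc.toimage and misc.imsave
# were deprecated in SciPy 1.0 and removed in later releases, so an old SciPy is needed.
import numpy as np
from PIL import Image
from scipy import misc

def inference(self, filename, self_ensemble=False):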
    img_q = Image.open(filename)

    scale = 4

    if img_q.width > 400:
        total = np.zeros((img_q.height * scale, img_q.width * scale, 3))
        #repeat 7 times with different piece size
        for x in range(7):
            total += self.inference_inter(filename, scale, 32 * (x + 8), ensemble=self_ensemble)

        total = total / 7.0

        image = misc.toimage(total, cmin=0, cmax=255)
        fn_res = filename.replace(".png", "_result.png").replace(".jpg", "_result.jpg")
        misc.imsave(fn_res, image)
    else:
        img_in = misc.imread(filename) / 255.0
        res = self.sess.run(self.out, feed_dict={ self.x: [img_in], self.is_training: False, self.lr_input: 0.0 })[0] * 255.0
        image = misc.toimage(res, cmin=0, cmax=255)
        fn_res = filename.replace(".png", "_result.png").replace(".jpg", "_result.jpg")
        misc.imsave(fn_res, image)


def inference_inter(self, filename, scale, pat_size, ensemble=True):
    img_in = misc.imread(filename) / 255.0
    img_q = Image.open(filename)

    res = np.zeros((img_q.height * scale, img_q.width * scale, 3))

    for qx in range(0, img_q.width, pat_size):
        for qy in range(0, img_q.height, pat_size):
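            # Select rows and columns in ONE subscript; chained slices a[r0:r1][c0:c1] would slice the rows twice.
            patch = img_in[qy:(qy + pat_size), qx:(qx + pat_size)]
            out = self.sess.run(self.out, feed_dict={ self.x: [patch], self.is_training: False, self.lr_input: 0.0 })[0] * 255.0
            res[(qy * scale):((qy + pat_size) * scale), (qx * scale):((qx + pat_size) * scale)] = out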

    return res
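
A note on the patch indexing used when tiling with NumPy: rows and columns have to be selected in a single subscript. The toy example below (array name and sizes are made up for illustration) shows how a chained slice differs from a combined one.

```python
import numpy as np

# Toy 4x6 "image" so the selected region is easy to inspect.
img = np.arange(24).reshape(4, 6)

# Chained slices: the second slice is applied to the ROWS of the first result.
chained = img[1:3][2:5]

# Single subscript: rows 1..2 and columns 2..4.
region = img[1:3, 2:5]

print(chained.shape)  # (0, 6)
print(region.shape)   # (2, 3)
```

With chained slices the patch ends up empty or covers full-width rows, so the tiles fed to the network are not the intended squares.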

Answer 2

This paper addresses your problem and explains why it occurs:

https://dl.acm.org/citation.cfm?id=3132847.3132872

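In case you cannot access it, here are some quotes and a summary:

There are two problems: padding and normalization.

Padding:

When performing a convolution, the network needs to fill in the missing pixel values at the border of the image. This is because, for the first two border pixels, the kernel extends outside the provided image.

Normalization:

In convolutional networks, batch normalization is used to speed up convergence and reduce internal covariate shift.

Now suppose we split the image into two sub-images. We then have two mini-batches. If they pass through the CNN independently, we get two normalized results, each computed with only the local mean and variance of its own sub-image. So even after merging the pieces back into one image, the result looks visibly different from the one produced for the whole image.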

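They propose the following solutions:

Padding:

For the interior borders, we use a novel padding technique, dictionary padding.

This means they use the corresponding pixel values from the neighboring sub-image.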
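
One way to read "use the neighboring sub-image's pixels" is overlapping tiles: each tile is read together with a margin of real neighboring pixels, and that margin is cropped from the output. The sketch below is only an illustration of that idea in NumPy, not the paper's implementation. The names `upscale_tiled`, `upscale_fn`, `tile` and `halo` are hypothetical, and `nearest_upscale` is a stand-in for the real model call.

```python
import numpy as np

def upscale_tiled(img, upscale_fn, scale, tile=128, halo=8):
    """Upscale img (H, W, C) tile by tile. Each tile is read with up to `halo`
    extra pixels of real neighbouring content, which is cropped from the output."""
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=np.float64)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            # Tile bounds, and the enlarged read window clipped to the image.
            y1, x1 = min(y + tile, h), min(x + tile, w)
            ry0, rx0 = max(y - halo, 0), max(x - halo, 0)
            ry1, rx1 = min(y1 + halo, h), min(x1 + halo, w)
            up = upscale_fn(img[ry0:ry1, rx0:rx1])
            # Crop the halo: keep only the part that belongs to the tile itself.
            oy, ox = (y - ry0) * scale, (x - rx0) * scale
            out[y * scale:y1 * scale, x * scale:x1 * scale] = up[
                oy:oy + (y1 - y) * scale, ox:ox + (x1 - x) * scale]
    return out

def nearest_upscale(patch, scale=4):
    # Stand-in for the network: plain nearest-neighbour upscaling.
    return patch.repeat(scale, axis=0).repeat(scale, axis=1)

img = np.random.rand(50, 70, 3)
tiled = upscale_tiled(img, nearest_upscale, scale=4, tile=32, halo=4)
print(tiled.shape)  # (200, 280, 3)
```

For a real network, `upscale_fn` would wrap the `sess.run` call, and the halo would presumably need to be at least as wide as the network's receptive-field radius for the seams to disappear; that width is an assumption to check against the actual model.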

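For the exterior borders, we use circular padding...

This means you pad the right side with pixels from the left side of the image, as if a copy of the image were concatenated on the right (and vice versa). The same applies to the top and bottom.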
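
As a small illustration, circular padding can be reproduced in NumPy with `np.pad` and `mode="wrap"`. The tiny image and the pad width of 1 are made up for the example.

```python
import numpy as np

# Tiny (H, W, C) image: channel 0 is [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]].
img = np.arange(12).reshape(3, 4, 1)

pad = 1
# Wrap around in height and width, leave the channel axis unpadded.
padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="wrap")

print(padded.shape)     # (5, 6, 1)
print(padded[1, 0, 0])  # 3: left border comes from the image's right edge
print(padded[0, 1, 0])  # 8: top border comes from the image's bottom edge
```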

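Normalization:

This one is a bit more complicated. In the end, they normalize the sub-images while sharing the normalization values (mean and variance) across them, and then combine the results.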
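
The idea of sharing statistics can be shown on a toy feature map in plain NumPy. This is a minimal sketch of the principle only, not the paper's algorithm and not TensorFlow code; the `normalize` helper, the array sizes, and the two-tile split are assumptions made for the example.

```python
import numpy as np

def normalize(x, mean, var, eps=1e-5):
    # Per-channel normalization with externally supplied statistics.
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# Toy feature map (H, W, C) whose left half is brighter than the right half.
fmap = rng.normal(size=(8, 16, 3))
fmap[:, :8] += 2.0
tiles = [fmap[:, :8], fmap[:, 8:]]

# Reference: normalize the whole map with its own per-channel statistics.
ref = normalize(fmap, fmap.mean(axis=(0, 1)), fmap.var(axis=(0, 1)))

# Independent tiles: each tile uses only its local mean and variance.
independent = np.concatenate(
    [normalize(t, t.mean(axis=(0, 1)), t.var(axis=(0, 1))) for t in tiles], axis=1)

# Shared statistics: accumulate count, sum and sum of squares over all tiles,
# then normalize every tile with the combined mean and variance.
n = sum(t.shape[0] * t.shape[1] for t in tiles)
s = sum(t.sum(axis=(0, 1)) for t in tiles)
sq = sum((t ** 2).sum(axis=(0, 1)) for t in tiles)
mean = s / n
var = sq / n - mean ** 2
shared = np.concatenate([normalize(t, mean, var) for t in tiles], axis=1)

print(np.allclose(shared, ref))       # True
print(np.allclose(independent, ref))  # False
```

In a real network this would have to happen at every normalization layer, which means collecting the statistics from all tiles before any tile can move past that layer.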

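The padding is probably easy to implement, but I don't know how to do the normalization in TensorFlow. If someone figures it out and posts some code, I would really appreciate it.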

TensorFlow, large image inference: out of memory
