I'm using a neural network for super-resolution (increasing the resolution of images). However, since an image can be large, I need to split it into multiple smaller images, run a prediction on each of them separately, and then merge the results back together.
Here are some examples of what this gives me:
Example 1: you can see a subtle vertical line in the output image running through the skier's shoulder.
Example 2: once you start noticing them, you'll see that the subtle lines form squares across the whole image (a leftover of how I split the image for the individual predictions).
Example 3: you can clearly see the vertical line crossing the lake.
Basically, my network predicts poorly at the edges of each tile, which I assume is normal since there is less "surrounding" information there.
import numpy as np
import matplotlib.pyplot as plt
import skimage.io
from keras.models import load_model
from constants import verbosity, save_dir, overlap, \
    model_name, tests_path, input_width, input_height
from utils import float_im
def predict(args):
    model = load_model(save_dir + '/' + args.model)

    image = skimage.io.imread(tests_path + args.image)[:, :, :3]  # removing possible extra channels (Alpha)
    print("Image shape:", image.shape)

    predictions = []
    images = []

    crops = seq_crop(image)  # crops the image into multiple sub-parts based on the 'input_' constants

    for i in range(len(crops)):         # amount of vertical crops
        for j in range(len(crops[0])):  # amount of horizontal crops
            current_image = crops[i][j]
            images.append(current_image)

    print("Moving on to predictions. Amount:", len(images))

    for p in range(len(images)):
        if p % 3 == 0 and verbosity == 2:
            print("--prediction #", p)

        # Hack because the GPU can only handle one image at a time
        input_img = np.expand_dims(images[p], 0)         # add the image to a batch where it's the only member
        predictions.append(model.predict(input_img)[0])  # 'predict' returns one output per image in the batch

    return predictions, image, crops
def show_pred_output(input, pred):
    plt.figure(figsize=(20, 20))
    plt.suptitle("Results")

    plt.subplot(1, 2, 1)
    plt.title("Input : " + str(input.shape[1]) + "x" + str(input.shape[0]))
    plt.imshow(input, cmap=plt.cm.binary).axes.get_xaxis().set_visible(False)

    plt.subplot(1, 2, 2)
    plt.title("Output : " + str(pred.shape[1]) + "x" + str(pred.shape[0]))
    plt.imshow(pred, cmap=plt.cm.binary).axes.get_xaxis().set_visible(False)

    plt.show()
# adapted from https://stackoverflow.com/a/52463034/9768291
def seq_crop(img):
    """
    Crops the whole image into a list of sub-images of the same size.
    The size comes from the "input_" variables in 'constants' (Evaluation).
    The bottom and right sub-images are padded with 0.

    :param img: input image
    :return: list of sub-images of the defined size
    """
    width_shape = ceildiv(img.shape[1], input_width)
    height_shape = ceildiv(img.shape[0], input_height)
    sub_images = []  # will contain all the cropped sub-parts of the image

    for j in range(height_shape):
        horizontal = []
        for i in range(width_shape):
            horizontal.append(crop_precise(img, i * input_width, j * input_height, input_width, input_height))
        sub_images.append(horizontal)

    return sub_images


def crop_precise(img, coord_x, coord_y, width_length, height_length):
    """
    Crops a precise portion of an image.
    When trying to crop outside of the boundaries, the input is padded with zeros.

    :param img: image to crop
    :param coord_x: width coordinate (top-left point)
    :param coord_y: height coordinate (top-left point)
    :param width_length: width of the cropped portion starting from coord_x
    :param height_length: height of the cropped portion starting from coord_y
    :return: the cropped part of the image
    """
    tmp_img = img[coord_y:coord_y + height_length, coord_x:coord_x + width_length]
    return float_im(tmp_img)  # from [0, 255] to [0., 1.]


# from https://stackoverflow.com/a/17511341/9768291
def ceildiv(a, b):
    return -(-a // b)
# adapted from https://stackoverflow.com/a/52733370/9768291
def reconstruct(predictions, crops):
    # unflatten predictions
    def nest(data, template):
        data = iter(data)
        return [[next(data) for _ in row] for row in template]

    if len(crops) != 0:
        predictions = nest(predictions, crops)

    H = np.cumsum([x[0].shape[0] for x in predictions])
    W = np.cumsum([x.shape[1] for x in predictions[0]])
    D = predictions[0][0]
    recon = np.empty((H[-1], W[-1], D.shape[2]), D.dtype)

    for rd, rs in zip(np.split(recon, H[:-1], 0), predictions):
        for d, s in zip(np.split(rd, W[:-1], 1), rs):
            d[...] = s

    return recon
if __name__ == '__main__':
    # 'args' comes from an argparse parser defined elsewhere (not shown here)
    print(" - ", args)
    preds, original, crops = predict(args)  # returns the predictions along with the original image
    enhanced = reconstruct(preds, crops)    # reconstructs the enhanced image from the predictions
    plt.imsave('output/' + args.save, enhanced, cmap=plt.cm.gray)
    show_pred_output(original, enhanced)
There are plenty of obvious, naive ways to solve this, but I'm convinced there must be a very concise one: how can I add an overlap_amount variable that lets me make overlapping predictions, so that I can discard the "edge part" of each sub-image (the "segment") and replace it with the result predicted for the segments surrounding it (since those won't contain that "edge prediction")?
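For illustration, here is a rough sketch of the kind of overlap-and-discard scheme I have in mind (this is not working code from my project: overlap_amount, step and margin are hypothetical names, scale_fact stands for the network's upscaling factor, and the loop assumes the model, image and "input_" constants used above):

# Hypothetical sketch of the overlap-and-discard idea (names are made up):
# tiles are taken with a stride smaller than their size, and only the centre of each prediction is kept.
step = input_width - overlap_amount          # stride between the origins of neighbouring tiles
margin = (overlap_amount // 2) * scale_fact  # upscaled margin to drop from each side of a prediction
for y in range(0, image.shape[0] - overlap_amount, step):
    for x in range(0, image.shape[1] - overlap_amount, step):
        tile = image[y:y + input_height, x:x + input_width]
        pred = model.predict(np.expand_dims(tile, 0))[0]
        kept = pred[margin:pred.shape[0] - margin, margin:pred.shape[1] - margin]
        # the 'kept' centres can then be glued edge-to-edge without visible seams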
Answer 0 (score: 1)
I solved a similar problem by moving inference to the CPU. It is much slower, but at least in my case it solved the patch-border problems better than the overlapping-ROI voting or discarding approaches I also tested.
Assuming you are using the Tensorflow backend:
from tensorflow.python import device

with device('cpu:0'):
    prediction = model.predict(...)
Of course this assumes you have enough RAM to fit your model. Comment below if that is not the case, and I'll check whether there's anything in my code that could be used here.
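As a side note, if your Keras is backed by a standard TensorFlow install, the same idea is more commonly written with the public tf.device context manager (a minimal sketch, assuming input_img is a single-image batch prepared as in the question):

import tensorflow as tf

with tf.device('/cpu:0'):  # run this prediction on the CPU instead of the GPU
    prediction = model.predict(input_img)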
Answer 1 (score: 0)
I solved it with a naive approach. It could probably be much better, but at least it works.
Basically, it takes the initial image, adds padding around it, and then crops it into multiple sub-images, which are all lined up into a single array. The crops are made so that every sub-image also overlaps with its neighbours.
Each of those images is then fed into the network and the predictions are collected (here, essentially 4x the image resolution). When reconstructing the image, each prediction is taken individually and its edge is cropped off (since it contains the errors). After that cropping, the predictions no longer overlap, and only the middle part of each prediction from the neural network is glued together.
Finally, the surrounding padding is removed.
No more lines! :D
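To see why the cropped predictions line up without seams, here is the size arithmetic with the constants I use below (64x64 input tiles, an overlap of 16, 4x upscaling); this is just a sanity check, not part of the pipeline:

# Size arithmetic for the constants used below (just a check, not part of the pipeline).
input_width, overlap, scale_fact = 64, 16, 4

stride = input_width - overlap            # 48: distance between the origins of two neighbouring crops
pred_size = input_width * scale_fact      # 256: size of one upscaled prediction
upscaled_overlap = overlap * 2            # 32: margin cropped from each side (equals overlap * scale_fact // 2 when scale_fact is 4)
kept = pred_size - 2 * upscaled_overlap   # 192: what remains of each cropped prediction

assert kept == stride * scale_fact        # the kept centres tile edge-to-edge, hence no visible lines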
import numpy as np
import matplotlib.pyplot as plt
import skimage.io
from keras.models import load_model
from constants import verbosity, save_dir, overlap, \
    model_name, tests_path, input_width, input_height, scale_fact
from utils import float_im
def predict(args):
    """
    Super-resolution on the input image using the model.

    :param args:
    :return:
        'predictions' contains an array of every single cropped sub-image once enhanced (the outputs of the model).
        'image' is the original image, untouched.
        'crops' is the array of every single cropped sub-image that will be used as input to the model.
    """
    model = load_model(save_dir + '/' + args.model)

    image = skimage.io.imread(tests_path + args.image)[:, :, :3]  # removing possible extra channels (Alpha)
    print("Image shape:", image.shape)

    predictions = []
    images = []

    # Padding and cropping the image
    overlap_pad = (overlap, overlap)                      # padding tuple
    pad_width = (overlap_pad, overlap_pad, (0, 0))        # assumes the color channel is last
    padded_image = np.pad(image, pad_width, 'constant')   # padding the border
    crops = seq_crop(padded_image)  # crops the image into multiple sub-parts based on the 'input_' constants

    # Arranging the divided image into a single-dimension array of sub-images
    for i in range(len(crops)):         # amount of vertical crops
        for j in range(len(crops[0])):  # amount of horizontal crops
            current_image = crops[i][j]
            images.append(current_image)

    print("Moving on to predictions. Amount:", len(images))
    upscaled_overlap = overlap * 2

    for p in range(len(images)):
        if p % 3 == 0 and verbosity == 2:
            print("--prediction #", p)

        # Hack due to some GPUs that can only handle one image at a time
        input_img = np.expand_dims(images[p], 0)  # add the image to a batch where it's the only member
        pred = model.predict(input_img)[0]        # 'predict' returns one output per image in the batch

        # Cropping the useless parts of the overlapped predictions (to prevent the repeated erroneous edge-prediction)
        pred = pred[upscaled_overlap:pred.shape[0] - upscaled_overlap, upscaled_overlap:pred.shape[1] - upscaled_overlap]

        predictions.append(pred)

    return predictions, image, crops
def show_pred_output(input, pred):
    plt.figure(figsize=(20, 20))
    plt.suptitle("Results")

    plt.subplot(1, 2, 1)
    plt.title("Input : " + str(input.shape[1]) + "x" + str(input.shape[0]))
    plt.imshow(input, cmap=plt.cm.binary).axes.get_xaxis().set_visible(False)

    plt.subplot(1, 2, 2)
    plt.title("Output : " + str(pred.shape[1]) + "x" + str(pred.shape[0]))
    plt.imshow(pred, cmap=plt.cm.binary).axes.get_xaxis().set_visible(False)

    plt.show()
# adapted from https://stackoverflow.com/a/52463034/9768291
def seq_crop(img):
    """
    Crops the whole image into a list of sub-images of the same size.
    The size comes from the "input_" variables in 'constants' (Evaluation).
    The bottom and right sub-images are padded with 0.

    :param img: input image
    :return: list of sub-images of the defined size (as per 'constants')
    """
    sub_images = []  # will contain all the cropped sub-parts of the image
    j, shifted_height = 0, 0

    while shifted_height < (img.shape[0] - input_height):
        horizontal = []
        shifted_height = j * (input_height - overlap)
        i, shifted_width = 0, 0

        while shifted_width < (img.shape[1] - input_width):
            shifted_width = i * (input_width - overlap)
            horizontal.append(crop_precise(img,
                                           shifted_width,
                                           shifted_height,
                                           input_width,
                                           input_height))
            i += 1

        sub_images.append(horizontal)
        j += 1

    return sub_images


def crop_precise(img, coord_x, coord_y, width_length, height_length):
    """
    Crops a precise portion of an image.
    When trying to crop outside of the boundaries, the input is padded with zeros.

    :param img: image to crop
    :param coord_x: width coordinate (top-left point)
    :param coord_y: height coordinate (top-left point)
    :param width_length: width of the cropped portion starting from coord_x (toward the right)
    :param height_length: height of the cropped portion starting from coord_y (toward the bottom)
    :return: the cropped part of the image
    """
    tmp_img = img[coord_y:coord_y + height_length, coord_x:coord_x + width_length]
    return float_im(tmp_img)  # from [0, 255] to [0., 1.]
# adapted from https://stackoverflow.com/a/52733370/9768291
def reconstruct(predictions, crops):
    """
    Used to reconstruct a whole image from an array of mini-predictions.
    The image had to be split into sub-images because the GPU's memory
    couldn't handle the prediction on the whole image.

    :param predictions: an array of upsampled images, from left to right, top to bottom.
    :param crops: 2D array of the cropped images
    :return: the reconstructed image as a whole
    """
    # unflatten predictions
    def nest(data, template):
        data = iter(data)
        return [[next(data) for _ in row] for row in template]

    if len(crops) != 0:
        predictions = nest(predictions, crops)

    # At this point, 'predictions' is a 2D grid of the individual output images
    H = np.cumsum([x[0].shape[0] for x in predictions])
    W = np.cumsum([x.shape[1] for x in predictions[0]])
    D = predictions[0][0]
    recon = np.empty((H[-1], W[-1], D.shape[2]), D.dtype)

    for rd, rs in zip(np.split(recon, H[:-1], 0), predictions):
        for d, s in zip(np.split(rd, W[:-1], 1), rs):
            d[...] = s

    # Removing the pad from the reconstruction
    tmp_overlap = overlap * (scale_fact - 1)  # using "-2" leaves the outer edge-prediction error
    return recon[tmp_overlap:recon.shape[0] - tmp_overlap, tmp_overlap:recon.shape[1] - tmp_overlap]
if __name__ == '__main__':
    # 'args' comes from an argparse parser defined elsewhere (not shown here)
    print(" - ", args)
    preds, original, crops = predict(args)  # returns the predictions along with the original image
    enhanced = reconstruct(preds, crops)    # reconstructs the enhanced image from the predictions

    # Save and display the result
    plt.imsave('output/' + args.save, enhanced, cmap=plt.cm.gray)
    show_pred_output(original, enhanced)
# The relevant values from 'constants' and 'utils' used above:
verbosity = 2
input_width = 64
input_height = 64
overlap = 16
scale_fact = 4


def float_im(img):
    return np.divide(img, 255.)
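For what it's worth, the reconstruction step can be sanity-checked on its own with dummy data (a hedged sketch: the shapes below are made up, and with overlap = 16 and scale_fact = 4 it should print (288, 480, 3)):

# Quick check of reconstruct() with dummy data (shapes are made up for illustration).
dummy_crops = [[np.zeros((64, 64, 3)) for _ in range(3)] for _ in range(2)]  # 2x3 grid of input tiles
dummy_preds = [np.ones((192, 192, 3)) for _ in range(6)]                     # matching flat list of "predictions"
print(reconstruct(dummy_preds, dummy_crops).shape)                           # expected: (288, 480, 3)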
A possibly better alternative you may want to look into if you run into the same problem as I did; it's the same basic idea, but more refined and polished.