I am trying to perform a 2D convolution in Python using numpy. I have a 2D array as shown below, with a row kernel H_r and a column kernel H_c:
data = np.zeros((nr, nc), dtype=np.float32)
# fill array with some data here, then convolve
for r in range(nr):
    data[r,:] = np.convolve(data[r,:], H_r, 'same')
for c in range(nc):
    data[:,c] = np.convolve(data[:,c], H_c, 'same')
data = data.astype(np.uint8);
It does not produce the output I was expecting. The code looks fine to me, so I think the problem is with the conversion from float32 to 8-bit. What is the best way to do this?

Thanks
Answer 0 (Score: 5)
Since you already have your kernel separated, you should simply use the sepfir2d function from scipy:
from scipy.signal import sepfir2d
convolved = sepfir2d(data, H_r, H_c)
Other than that, the code you have there looks fine...
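A minimal usage sketch, assuming H_r and H_c are 1D arrays of odd length (sepfir2d expects odd-length filters) and using random data as a stand-in for the real array:

import numpy as np
from scipy.signal import sepfir2d

H_r = np.array([1.0, 2.0, 1.0]) / 4.0     # assumed 1D row filter
H_c = np.array([1.0, 2.0, 1.0]) / 4.0     # assumed 1D column filter

data = np.random.rand(64, 64)             # stand-in for the real data
convolved = sepfir2d(data, H_r, H_c)      # separable 2D filtering in a single call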
Answer 1 (Score: 5)
@Tashus's comment is correct, so @dudemeister's answer is probably more on the mark. The function he suggested is also more efficient, since it avoids a direct 2D convolution and the number of operations that would entail.
I believe you are doing two 1D convolutions, one along the rows and then one along the columns, and replacing the results of the first with the results of the second.
Note that numpy.convolve with the 'same' argument returns an array of the same shape as the largest one provided, so when you do the first convolution you have already populated the entire data array.

One good way to visualize your arrays during these steps is to use Hinton diagrams, so you can check which elements already have a value.
You could try adding the results of the two convolutions (use data[:,c] += .. instead of data[:,c] = in the second for loop), if your 2D convolution matrix is the result of combining the one-dimensional H_r and H_c matrices.
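Concretely, a short sketch of that change to the question's loops (only the second loop changes; nr, nc, H_r, H_c and data are assumed to be defined as in the question):

for r in range(nr):
    data[r,:] = np.convolve(data[r,:], H_r, 'same')   # first pass: rows (overwrites data)
for c in range(nc):
    data[:,c] += np.convolve(data[:,c], H_c, 'same')  # second pass: add instead of overwrite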
Another way to do this would be to use scipy.signal.convolve2d with a 2D convolution array, which is probably what you wanted to do in the first place.
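A minimal sketch of that approach, assuming the 2D kernel is the outer product of the two 1D kernels (the kernel values below are made up for illustration):

import numpy as np
from scipy.signal import convolve2d

H_r = np.array([1.0, 2.0, 1.0]) / 4.0               # assumed 1D row kernel
H_c = np.array([1.0, 2.0, 1.0]) / 4.0               # assumed 1D column kernel
kernel_2d = np.outer(H_c, H_r)                      # separable 2D kernel as a full array

data = np.random.rand(64, 64).astype(np.float32)    # stand-in for the real data
convolved = convolve2d(data, kernel_2d, mode='same', boundary='symm')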
Answer 2 (Score: 4)
It may not be the most optimized solution, but here is an implementation I have used before with Python's numpy library:
import numpy as np

def convolution2d(image, kernel, bias):
    m, n = kernel.shape
    if (m == n):                     # only square kernels are handled here
        y, x = image.shape
        y = y - m + 1                # output height for a 'valid' convolution
        x = x - m + 1                # output width for a 'valid' convolution
        new_image = np.zeros((y, x))
        for i in range(y):
            for j in range(x):
                # element-wise product of the kernel with the current window, then sum
                new_image[i][j] = np.sum(image[i:i+m, j:j+m]*kernel) + bias
    return new_image
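A small, hedged usage example (the image and kernel values are made up for illustration):

img = np.arange(25, dtype=np.float32).reshape(5, 5)
k = np.ones((3, 3)) / 9.0                     # 3x3 box-blur kernel
blurred = convolution2d(img, k, bias=0)       # 'valid' output
print(blurred.shape)                          # (3, 3)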
I hope this code helps other people with the same doubt.

Regards.
Answer 3 (Score: 1)
It may not be the most optimized solution either, but it is roughly ten times faster than the one proposed by @omotto, and it only uses basic numpy functions (reshape, expand_dims, tile...) and no 'for' loops:
import numpy as np

def gen_idx_conv1d(in_size, ker_size):
    """
    Generates a list of indices. These indices correspond to the indices
    of a 1D input tensor on which we would like to apply a 1D convolution.

    For instance, with a 1D input array of size 5 and a kernel of size 3, the
    1D convolution product will successively look at elements of indices [0,1,2],
    [1,2,3] and [2,3,4] in the input array. In this case, the function gen_idx_conv1d(5,3)
    outputs the following array: array([0,1,2,1,2,3,2,3,4]).

    args:
        in_size: (type: int) size of the input 1d array.
        ker_size: (type: int) kernel size.

    return:
        idx_list: (type: np.array) list of the successive indices of the 1D input array
        accessed by the 1D convolution algorithm.

    example:
        >>> gen_idx_conv1d(in_size=5, ker_size=3)
        array([0, 1, 2, 1, 2, 3, 2, 3, 4])
    """
    f = lambda dim1, dim2, axis: np.reshape(np.tile(np.expand_dims(np.arange(dim1), axis), dim2), -1)
    out_size = in_size - ker_size + 1
    return f(ker_size, out_size, 0) + f(out_size, ker_size, 1)
def repeat_idx_2d(idx_list, nbof_rep, axis):
    """
    Repeats an array of indices (idx_list) a number of times (nbof_rep) "along" an axis
    (axis). This function helps to browse through a 2d array of size
    (len(idx_list), nbof_rep).

    args:
        idx_list: (type: np.array or list) a 1D array of indices.
        nbof_rep: (type: int) number of repetitions.
        axis: (type: int) axis "along" which the repetition will be applied.

    return:
        idx_list: (type: np.array) a 1D array of indices of size len(idx_list)*nbof_rep.

    example:
        >>> a = np.array([0, 1, 2])
        >>> repeat_idx_2d(a, 3, 0) # repeats array 'a' 3 times along 'axis' 0
        array([0, 0, 0, 1, 1, 1, 2, 2, 2])

        >>> repeat_idx_2d(a, 3, 1) # repeats array 'a' 3 times along 'axis' 1
        array([0, 1, 2, 0, 1, 2, 0, 1, 2])

        >>> b = np.reshape(np.arange(3*4), (3,4))
        >>> b[repeat_idx_2d(np.arange(3), 4, 0), repeat_idx_2d(np.arange(4), 3, 1)]
        array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
    """
    assert axis in [0, 1], "Axis should be equal to 0 or 1."
    tile_axis = (nbof_rep, 1) if axis else (1, nbof_rep)
    return np.reshape(np.tile(np.expand_dims(idx_list, 1), tile_axis), -1)
def conv2d(im, ker):
    """
    Performs a 'valid' 2D convolution on an image. The input image may be
    a 2D or a 3D array.

    The output image's first two dimensions will be reduced depending on the
    convolution size.

    The kernel may be a 2D or a 3D array. If 2D, it will be applied on every
    channel of the input image. If 3D, its last dimension must match the
    image's one.

    args:
        im: (type: np.array) image (2D or 3D).
        ker: (type: np.array) convolution kernel (2D or 3D).

    returns:
        im: (type: np.array) convolved image.

    example:
        >>> im = np.reshape(np.arange(10*10*3),(10,10,3))/(10*10*3) # 3D image
        >>> ker = np.array([[0,1,0],[-1,0,1],[0,-1,0]]) # 2D kernel
        >>> conv2d(im, ker) # 3D array of shape (8,8,3)
    """
    if len(im.shape) == 2: # if the image is a 2D array, it is reshaped by expanding the last dimension
        im = np.expand_dims(im, -1)

    im_x, im_y, im_w = im.shape

    if len(ker.shape) == 2: # if the kernel is a 2D array, it is reshaped so it will be applied to all of the image channels
        ker = np.tile(np.expand_dims(ker, -1), [1, 1, im_w]) # the same kernel will be applied to all of the channels

    assert ker.shape[-1] == im.shape[-1], "Kernel and image last dimension must match."

    ker_x = ker.shape[0]
    ker_y = ker.shape[1]

    # shape of the output image
    out_x = im_x - ker_x + 1
    out_y = im_y - ker_y + 1

    # reshapes the image to (out_x, ker_x, out_y, ker_y, im_w)
    idx_list_x = gen_idx_conv1d(im_x, ker_x) # computes the indices of a 1D conv (cf. gen_idx_conv1d doc)
    idx_list_y = gen_idx_conv1d(im_y, ker_y)

    idx_reshaped_x = repeat_idx_2d(idx_list_x, len(idx_list_y), 0) # repeats the previous indices to be used in 2D (cf. repeat_idx_2d doc)
    idx_reshaped_y = repeat_idx_2d(idx_list_y, len(idx_list_x), 1)

    im_reshaped = np.reshape(im[idx_reshaped_x, idx_reshaped_y, :], [out_x, ker_x, out_y, ker_y, im_w]) # reshapes

    # reshapes the 2D kernel
    ker = np.reshape(ker, [1, ker_x, 1, ker_y, im_w])

    # applies the kernel to the image and reduces the dimension back to the one of original input image
    return np.squeeze(np.sum(im_reshaped * ker, axis=(1, 3)))
I tried to add a lot of comments to explain the method, but the global idea is to reshape the 3D input image into a 5D array of shape (output_image_height, kernel_height, output_image_width, kernel_width, output_image_channel) and then apply the kernel directly using basic array multiplication. Of course, this method uses more memory (during execution the size of the image is multiplied by kernel_height * kernel_width), but it is much faster.

To do this reshaping step I "over-used" the indexing methods of numpy arrays, especially the possibility of giving a numpy array as an index into another numpy array.

This method could also be used to re-code the 2D convolution product in Pytorch or Tensorflow using only basic math functions, but I have no doubt it would be slower than the existing nn.conv2d operator...

I really enjoyed coding this method using only basic numpy tools.
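As a hedged sanity check (not part of the original answer), the result can be compared against scipy. Note that, like most deep-learning style "convolutions", conv2d above does not flip the kernel, so it matches scipy.signal.correlate2d:

from scipy.signal import correlate2d

im = np.random.rand(10, 10)
ker = np.array([[0, 1, 0], [-1, 0, 1], [0, -1, 0]], dtype=float)

out_fast = conv2d(im, ker)                      # shape (8, 8), 'valid' output
out_ref = correlate2d(im, ker, mode='valid')    # reference implementation
print(np.allclose(out_fast, out_ref))           # expected to print True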
Answer 4 (Score: 0)
Try rounding first and then converting to uint8:
data = data.round().astype(np.uint8);
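If the filtered values can fall outside the 0-255 range, a hedged variant (not part of the original answer) is to clip before casting, since a plain astype(np.uint8) wraps out-of-range values around:

data = np.clip(np.round(data), 0, 255).astype(np.uint8)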
Answer 5 (Score: 0)
One of the most obvious approaches is to hard-code the kernel.
from PIL import Image
import numpy as np

# 'img' is assumed to be an already-loaded PIL image
img = img.convert('L')                  # convert to grayscale
a = np.array(img)
out = np.zeros([a.shape[0]-2, a.shape[1]-2], dtype='float')
out += a[:-2, :-2]
out += a[1:-1, :-2]
out += a[2:, :-2]
out += a[:-2, 1:-1]
out += a[1:-1, 1:-1]
out += a[2:, 1:-1]
out += a[:-2, 2:]
out += a[1:-1, 2:]
out += a[2:, 2:]
out /= 9.0                              # average of the 9 neighbours -> 3x3 box blur
out = out.astype('uint8')
img = Image.fromarray(out)
This example does a completely unrolled 3x3 box blur. You can multiply the values by different weights and divide by a different amount, but if you honestly want the quickest and dirtiest method, this is it. I think it beats Guillaume Mougeot's method by a factor of about 5, and his method beats the others by a factor of 10.

It may lose a few steps if you are doing something like a Gaussian blur, where you would also need to multiply some of the terms.
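As an illustration, a hedged sketch of the same unrolled style with unequal weights (a rough Gaussian-like 3x3 blur using the classic 1-2-1 binomial weights; these values are not from the original answer):

out = np.zeros([a.shape[0]-2, a.shape[1]-2], dtype='float')
out += 1 * a[:-2, :-2] + 2 * a[:-2, 1:-1] + 1 * a[:-2, 2:]
out += 2 * a[1:-1, :-2] + 4 * a[1:-1, 1:-1] + 2 * a[1:-1, 2:]
out += 1 * a[2:, :-2] + 2 * a[2:, 1:-1] + 1 * a[2:, 2:]
out /= 16.0                             # the weights sum to 16
out = out.astype('uint8')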
Answer 6 (Score: 0)
I checked out many implementations and found none that fit my purpose, which should be really simple. So here is a dead-simple for-loop implementation:
import numpy as np

def convolution2d(image, kernel, stride, padding):
    image = np.pad(image, [(padding, padding), (padding, padding)], mode='constant', constant_values=0)
    kernel_height, kernel_width = kernel.shape
    padded_height, padded_width = image.shape
    output_height = (padded_height - kernel_height) // stride + 1
    output_width = (padded_width - kernel_width) // stride + 1
    new_image = np.zeros((output_height, output_width)).astype(np.float32)

    for y in range(0, output_height):
        for x in range(0, output_width):
            # sum of the element-wise product between the kernel and the current window
            new_image[y][x] = np.sum(image[y * stride:y * stride + kernel_height,
                                           x * stride:x * stride + kernel_width] * kernel).astype(np.float32)
    return new_image
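A hedged usage sketch (sizes and values chosen arbitrarily for illustration):

img = np.arange(36, dtype=np.float32).reshape(6, 6)
k = np.ones((3, 3), dtype=np.float32)
out = convolution2d(img, k, stride=1, padding=1)   # padded to 8x8, output shape (6, 6)
print(out.shape)                                   # (6, 6)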
Answer 7 (Score: -2)
This code is incorrect:
for r in range(nr):
    data[r,:] = np.convolve(data[r,:], H_r, 'same')
for c in range(nc):
    data[:,c] = np.convolve(data[:,c], H_c, 'same')
See the Nussbaumer transformation from multidimensional convolution to one-dimensional convolution.