我有非常大的矩阵,所以不要通过遍历每一行和列来求和。
a = [[1,2,3],[3,4,5],[5,6,7]]
def neighbors(i,j,a):
return [a[i][j-1], a[i][(j+1)%len(a[0])], a[i-1][j], a[(i+1)%len(a)][j]]
[[np.mean(neighbors(i,j,a)) for j in range(len(a[0]))] for i in range(len(a))]
此代码适用于3x3或小范围的矩阵,但对于像2k x 2k这样的大矩阵,这是不可行的。如果缺少矩阵中的任何值或者像na
此代码适用于3x3或小范围的矩阵,但对于像2k x 2k这样的大矩阵,这是不可行的。如果缺少矩阵中的任何值或者像na
那样,这也不起作用。如果任何邻居值为na
,则跳过该邻居获取平均值
答案 0 :(得分:5)
这假设您希望在窗口为3 x 3
的输入数组中获取滑动窗口平均值,并且仅考虑西北 - 东 - 南邻域元素。
对于这种情况,可以使用具有适当内核的signal.convolve2d
。最后,您需要将这些求和除以内核中的求和数,即kernel.sum()
仅作为对求和的贡献。这是实施 -
import numpy as np
from scipy import signal
# Inputs
a = [[1,2,3],[3,4,5],[5,6,7],[4,8,9]]
# Convert to numpy array
arr = np.asarray(a,float)
# Define kernel for convolution
kernel = np.array([[0,1,0],
[1,0,1],
[0,1,0]])
# Perform 2D convolution with input data and kernel
out = signal.convolve2d(arr, kernel, boundary='wrap', mode='same')/kernel.sum()
这与假设#1中的假设相同,只是我们希望在只有零元素的邻域中找到平均值,并打算用这些平均值替换它们。
方法#1:以下是使用手动选择性卷积方法实现此目的的一种方法 -
import numpy as np
# Convert to numpy array
arr = np.asarray(a,float)
# Pad around the input array to take care of boundary conditions
arr_pad = np.lib.pad(arr, (1,1), 'wrap')
R,C = np.where(arr==0) # Row, column indices for zero elements in input array
N = arr_pad.shape[1] # Number of rows in input array
offset = np.array([-N, -1, 1, N])
idx = np.ravel_multi_index((R+1,C+1),arr_pad.shape)[:,None] + offset
arr_out = arr.copy()
arr_out[R,C] = arr_pad.ravel()[idx].sum(1)/4
示例输入,输出 -
In [587]: arr
Out[587]:
array([[ 4., 0., 3., 3., 3., 1., 3.],
[ 2., 4., 0., 0., 4., 2., 1.],
[ 0., 1., 1., 0., 1., 4., 3.],
[ 0., 3., 0., 2., 3., 0., 1.]])
In [588]: arr_out
Out[588]:
array([[ 4. , 3.5 , 3. , 3. , 3. , 1. , 3. ],
[ 2. , 4. , 2. , 1.75, 4. , 2. , 1. ],
[ 1.5 , 1. , 1. , 1. , 1. , 4. , 3. ],
[ 2. , 3. , 2.25, 2. , 3. , 2.25, 1. ]])
为了处理边界条件,还有其他填充选项。查看numpy.pad
了解更多信息。
方法#2:这将是前面Shot #1
中列出的基于卷积的方法的修改版本。这与之前的方法相同,只是在最后,我们选择性地替换
具有卷积输出的零元素。这是代码 -
import numpy as np
from scipy import signal
# Inputs
a = [[1,2,3],[3,4,5],[5,6,7],[4,8,9]]
# Convert to numpy array
arr = np.asarray(a,float)
# Define kernel for convolution
kernel = np.array([[0,1,0],
[1,0,1],
[0,1,0]])
# Perform 2D convolution with input data and kernel
conv_out = signal.convolve2d(arr, kernel, boundary='wrap', mode='same')/kernel.sum()
# Initialize output array as a copy of input array
arr_out = arr.copy()
# Setup a mask of zero elements in input array and
# replace those in output array with the convolution output
mask = arr==0
arr_out[mask] = conv_out[mask]
备注: Approach #1
是输入数组中零元素数量较少的首选方式,否则请使用Approach #2
。
答案 1 :(得分:3)
这是@Divakar答案下的评论附录(而非独立答案)。
出于好奇,我尝试了针对scipy卷积的不同'伪'卷积。最快的一个是%(模数)包装一个,这让我感到惊讶:显然numpy通过索引做了一些聪明的事情,尽管显然不需要填充会节省时间。fn3 - > 9.5ms,fn1 - > 21ms,fn2 - > 232ms
import timeit
setup = """
import numpy as np
from scipy import signal
N = 1000
M = 750
P = 5 # i.e. small number -> bigger proportion of zeros
a = np.random.randint(0, P, M * N).reshape(M, N)
arr = np.asarray(a,float)"""
fn1 = """
arr_pad = np.lib.pad(arr, (1,1), 'wrap')
R,C = np.where(arr==0)
N = arr_pad.shape[1]
offset = np.array([-N, -1, 1, N])
idx = np.ravel_multi_index((R+1,C+1),arr_pad.shape)[:,None] + offset
arr[R,C] = arr_pad.ravel()[idx].sum(1)/4"""
fn2 = """
kernel = np.array([[0,1,0],
[1,0,1],
[0,1,0]])
conv_out = signal.convolve2d(arr, kernel, boundary='wrap', mode='same')/kernel.sum()
mask = arr == 0.0
arr[mask] = conv_out[mask]"""
fn3 = """
R,C = np.where(arr == 0.0)
arr[R, C] = (arr[(R-1)%M,C] + arr[R,(C-1)%N] + arr[R,(C+1)%N] + arr[(R+1)%M,C]) / 4.0
"""
print(timeit.timeit(fn1, setup, number = 100))
print(timeit.timeit(fn2, setup, number = 100))
print(timeit.timeit(fn3, setup, number = 100))
答案 2 :(得分:1)
使用numpy
和scipy.ndimage
,您可以应用"足迹"它定义了你在哪里寻找每个元素的邻居并将函数应用于这些邻居:
import numpy as np
import scipy.ndimage as ndimage
# Getting neighbours horizontally and vertically,
# not diagonally
footprint = np.array([[0,1,0],
[1,0,1],
[0,1,0]])
a = [[1,2,3],[3,4,5],[5,6,7]]
# Need to make sure that dtype is float or the
# mean won't be calculated correctly
a_array = np.array(a, dtype=float)
# Can specify that you want neighbour selection to
# wrap around at the borders
ndimage.generic_filter(a_array, np.mean,
footprint=footprint, mode='wrap')
Out[36]:
array([[ 3.25, 3.5 , 3.75],
[ 3.75, 4. , 4.25],
[ 4.25, 4.5 , 4.75]])