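I wrote the following code in Python while programming a CNN. It is not very efficient: running a 5x5 kernel over a 512x512 image with stride 1 and no padding takes 5-6 seconds for the whole convolution. Does anyone have an optimized solution, an efficient alternative (a module or package), or any other suggestions for reducing the run time? The code is below: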
import numpy as np

y = np.random.uniform(low=0, high=5, size=(5, 5))        # 5x5 kernel
x = np.random.uniform(low=0.1, high=3, size=(512, 512))  # 512x512 image
k = y.shape[0]  # kernel side length
m = x.shape[0]  # image height
n = x.shape[1]  # image width
# Output size for stride 1 and no padding: (m - k + 1) x (n - k + 1)
res = np.zeros((m - k + 1, n - k + 1), dtype="float16")
for i in range(m - (k - 1)):
    for j in range(n - (k - 1)):
        total = 0  # renamed from `sum` to avoid shadowing the built-in
        for ki in range(k):
            for kj in range(k):
                data = x[i + ki][j + kj]
                kval = y[ki][kj]
                total = total + data * kval
        res[i][j] = total
print(res)
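
For comparison, here is a minimal sketch of one vectorized way to compute the same sums without Python-level loops. It assumes NumPy 1.20 or newer (for `sliding_window_view`), and the helper name `correlate2d_valid` is made up for illustration. Like the loops above, it does not flip the kernel, and it keeps the result in float64 rather than float16. It is meant as a possible direction, not a definitive answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate2d_valid(image, kernel):
    """Stride-1, no-padding sliding dot product (hypothetical helper name)."""
    # windows has shape (H - kh + 1, W - kw + 1, kh, kw); it is a view, not a copy.
    windows = sliding_window_view(image, kernel.shape)
    # Multiply every window by the kernel and sum over the two kernel axes.
    return np.einsum("ijkl,kl->ij", windows, kernel)


rng = np.random.default_rng(0)
y = rng.uniform(low=0, high=5, size=(5, 5))        # 5x5 kernel
x = rng.uniform(low=0.1, high=3, size=(512, 512))  # 512x512 image
res = correlate2d_valid(x, y)
print(res.shape)  # (508, 508)
```

If SciPy is available, `scipy.signal.correlate2d(x, y, mode="valid")` computes the same quantity and may be worth timing as well.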