Question

While using Python to implement a sliding window for detecting objects in still images, I came across this handy function:

numpy.lib.stride_tricks.as_strided

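So I tried to work out a general rule, to avoid mistakes that could make it fail whenever I change the sliding-window size I need. In the end I arrived at this expression: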

all_windows = as_strided(x,((x.shape[0] - xsize)/xstep ,(x.shape[1] - ysize)/ystep ,xsize,ysize), (x.strides[0]*xstep,x.strides[1]*ystep,x.strides[0],x.strides[1]))

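This results in a 4-dimensional array. The first two dimensions are the number of windows along the x and y axes of the image. The other two are the size of the window (xsize, ysize),

and step is the displacement between two consecutive windows.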

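This representation works fine if I choose a square sliding window. But I still have a problem getting it to work for windows of, e.g., (128, 64), where I usually get data unrelated to the image.

What is wrong with my code? Any ideas? And is there a better way to do sliding windows in Python for image processing?

Thanks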

Answer 1

Check out the answers to this question: Using strides for an efficient moving average filter. Basically, strides are not a great option, although they do work.
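For readers on NumPy 1.20 or newer, one way to avoid hand-written stride arithmetic is `numpy.lib.stride_tricks.sliding_window_view`, which works out the shape and strides itself and returns a read-only view. The sketch below is not taken from the linked answer; the array and the (3, 2) window size are made-up example values.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

image = np.arange(30).reshape(5, 6)   # stand-in for a grayscale image
window_shape = (3, 2)                 # non-square window (rows, cols), example values

# Every window at step 1; shape is (5 - 3 + 1, 6 - 2 + 1, 3, 2).
windows = sliding_window_view(image, window_shape)

# A step larger than 1 is just a slice over the first two axes.
row_step, col_step = 2, 2
stepped = windows[::row_step, ::col_step]

print(windows.shape)   # (3, 5, 3, 2)
print(stepped.shape)   # (2, 3, 3, 2)
```

Because the result is a view, no pixel data is copied until a window is modified or explicitly copied.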

Answer 2

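There is an issue in your code. This code actually works fine for 2D, and there is no reason to use the multi-dimensional version (Using strides for an efficient moving average filter). Below is a fixed version: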

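import numpy as np
from numpy.lib.stride_tricks import as_strided

xsize, ysize, xstep, ystep = 3, 4, 1, 1  # example window size and step (assumed values)
A = np.arange(100).reshape((10, 10))
print(A)
# Integer division (//) keeps the shape entries integers, which as_strided requires.
all_windows = as_strided(A, ((A.shape[0] - xsize + 1) // xstep, (A.shape[1] - ysize + 1) // ystep, xsize, ysize),
      (A.strides[0] * xstep, A.strides[1] * ystep, A.strides[0], A.strides[1]))
print(all_windows)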
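One way to check a strided view like the one in this answer is to compare each window against plain slicing. The following is a minimal sketch, assuming a 2-D input; the helper name `sliding_windows` and its parameters are hypothetical. It counts windows as `(n - size) // step + 1`, which agrees with the expression in the answer when the step is 1 and does not drop a valid last window for larger steps.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def sliding_windows(a, xsize, ysize, xstep=1, ystep=1):
    """Return a read-only 4-D view of all (xsize, ysize) windows of a 2-D array."""
    if a.ndim != 2:
        raise ValueError("expected a 2-D array")
    if xsize > a.shape[0] or ysize > a.shape[1]:
        raise ValueError("window is larger than the array")
    nx = (a.shape[0] - xsize) // xstep + 1   # number of windows along axis 0
    ny = (a.shape[1] - ysize) // ystep + 1   # number of windows along axis 1
    s0, s1 = a.strides
    return as_strided(a, shape=(nx, ny, xsize, ysize),
                      strides=(s0 * xstep, s1 * ystep, s0, s1),
                      writeable=False)

A = np.arange(100).reshape(10, 10)
w = sliding_windows(A, xsize=4, ysize=2, xstep=2, ystep=3)   # non-square window
print(w.shape)

# Window (i, j) should equal the slice that starts at (i * xstep, j * ystep).
for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        assert np.array_equal(w[i, j], A[i * 2:i * 2 + 4, j * 3:j * 3 + 2])
```

Taking the strides from the array itself, rather than assuming an element size, keeps the view valid for any dtype and for non-contiguous inputs such as a cropped image.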

Answer 3

For posterity:

This is implemented in scikit-learn in the function sklearn.feature_extraction.image.extract_patches.

Answer 4

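I had a similar use case where I needed to create sliding windows over a batch of multi-channel images, and I ended up coming up with the function below. I've written a more in-depth blog post covering this in regards to manually creating a Convolution layer. The function implements the sliding windows and also supports dilating the input array or adding padding to it.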
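A function of this kind can be sketched as follows. This is a hypothetical reconstruction built only from the parameter descriptions in this answer, not the author's code: the name `get_windows`, passing `output_size` as an `(out_h, out_w)` pair, and the bounds check are all assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def get_windows(x, output_size, kernel_size, padding=0, stride=1, dilate=0):
    """Sketch: square sliding windows over a (batch, channel, height, width) array."""
    work = x
    if dilate:
        # Spread the cells: `dilate` zero-filled rows/cols between neighbouring elements.
        b, c, h, w = work.shape
        spread = np.zeros((b, c, (h - 1) * (dilate + 1) + 1, (w - 1) * (dilate + 1) + 1),
                          dtype=work.dtype)
        spread[:, :, ::dilate + 1, ::dilate + 1] = work
        work = spread
    if padding:
        # Zero-pad only the height and width axes.
        work = np.pad(work, ((0, 0), (0, 0), (padding, padding), (padding, padding)),
                      mode="constant")
    out_h, out_w = output_size
    b, c, h, w = work.shape
    # Assumed safety check: the last window must still lie inside the array.
    if (out_h - 1) * stride + kernel_size > h or (out_w - 1) * stride + kernel_size > w:
        raise ValueError("output_size does not fit the (padded, dilated) input")
    sb, sc, sh, sw = work.strides
    return as_strided(work,
                      shape=(b, c, out_h, out_w, kernel_size, kernel_size),
                      strides=(sb, sc, sh * stride, sw * stride, sh, sw),
                      writeable=False)

x = np.arange(2 * 3 * 5 * 5, dtype=float).reshape(2, 3, 5, 5)
k, p, s = 3, 1, 2                      # example kernel size, padding, stride
out = (5 - k + 2 * p) // s + 1         # forward-pass output size per spatial axis
wins = get_windows(x, (out, out), k, padding=p, stride=s)
print(wins.shape)                      # (2, 3, 3, 3, 3, 3)
```

The result has one `kernel_size` by `kernel_size` window per output position, so a convolution reduces to a sum of products over the last two axes and the channel axis.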

The function takes as input:

input - Size of (Batch, Channel, Height, Width)
output_size - Depends on usage, comments below.
kernel_size - size of the sliding window you wish to create (square)
padding - amount of 0-padding added to the outside of the (H,W) dimensions
stride - stride the sliding window should take over the inputs
dilate - amount to spread the cells of the input. This adds 0-filled rows/cols between elements

Typically, when performing a forward convolution, you do not need dilation, so the output size can be found with the following formula (replace x with the input dimension):

(x - kernel_size + 2 * padding) // stride + 1

When performing the backward pass of a convolution with this function, use a stride of 1 and set output_size to the size of the forward pass's x-input.

Sample code with an example using this function can be found at this link.
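As a quick check of the output-size formula in this answer, the sketch below evaluates it for a few settings. The helper name `conv_output_size` and the sample numbers are illustrative assumptions, not part of the answer.

```python
def conv_output_size(x, kernel_size, padding=0, stride=1):
    """Number of window positions along one dimension of length x (no dilation)."""
    return (x - kernel_size + 2 * padding) // stride + 1

print(conv_output_size(28, kernel_size=3))                       # 26
print(conv_output_size(28, kernel_size=3, padding=1))            # 28
print(conv_output_size(28, kernel_size=3, padding=1, stride=2))  # 14
```

With a 3-wide kernel, a padding of 1 and a stride of 1 keep the output the same size as the input, which is why that combination is so common.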

Sliding window using the as_strided function in numpy?

4 answers: