Question

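Suppose you have a 1000x1000 grid W of positive integer weights.

We want to find the cell that minimizes the average weighted distance to every cell.

The brute-force way to do this is to loop over each candidate cell and calculate the distance: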

int best_x, best_y;
float best_dist = infinity;  // start above any reachable total

for x0 = 1:1000,
    for y0 = 1:1000,

        float total_dist = 0;  // sqrt gives non-integer distances

        for x1 = 1:1000,
            for y1 = 1:1000,
                total_dist += W[x1,y1] * sqrt((x0-x1)^2 + (y0-y1)^2);

        if (total_dist < best_dist)
            best_x = x0;
            best_y = y0;
            best_dist = total_dist;

This takes about 10^12 operations, which is too long.

Is there a way to do this in or near ~10^8 operations?

Answer 1

Theory

This can be done with filtering in O(n m log nm) time, where n, m are the grid dimensions.
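To put that bound next to the brute-force count from the question, here is a rough order-of-magnitude estimate in Python. It ignores constant factors such as padding and the number of transforms, so the real FFT cost is some multiple of the figure shown; the variable names are only illustrative.

```python
import math

n = m = 1000

# Brute force: every candidate cell visits every other cell.
brute_ops = (n * m) ** 2

# FFT-based filtering: proportional to nm * log2(nm), constants ignored.
fft_ops = n * m * math.log2(n * m)

print(f"{brute_ops:.1e}")  # 1.0e+12
print(f"{fft_ops:.1e}")    # 2.0e+07
```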

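You need to define a filter of size 2n + 1 x 2m + 1, and you need to embed (centered) your original weight grid in a grid of zeros of size 3n x 3m. The filter needs to be the distance weighting from the origin at (n,m):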

 F(i,j) = sqrt((n-i)^2 + (m-j)^2)

Let W denote the original weight grid embedded (centered) in a grid of zeros of size 3n x 3m.

Then the filtering (cross-correlation) result

 R = F o W

gives you the total_dist grid. Take min R (ignoring the extra embedded zeros you added around W) to find the best x0, y0 position.

Image (i.e. grid) filtering is very standard and can be done in all sorts of existing software, such as MATLAB with the imfilter command.

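I should note that although I explicitly used cross-correlation above, you would get the same result with convolution only because your filter F is symmetric. In general, image filtering is cross-correlation, not convolution, though the two operations are closely analogous.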
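The symmetry claim is easy to check numerically. The sketch below builds F from the formula above in Python/NumPy (0-based indices; distance_filter is a hypothetical helper name, not part of the answer) and confirms that F equals its own 180-degree rotation. Convolution is cross-correlation with the kernel rotated by 180 degrees, so for this F the two coincide.

```python
import numpy as np

def distance_filter(n, m):
    """F(i, j) = sqrt((n - i)^2 + (m - j)^2) on a (2n+1) x (2m+1) grid."""
    i = np.arange(2 * n + 1)[:, None]
    j = np.arange(2 * m + 1)[None, :]
    return np.sqrt((n - i) ** 2 + (m - j) ** 2)

F = distance_filter(3, 5)

# Rotating the kernel by 180 degrees is what turns correlation into convolution.
flipped = F[::-1, ::-1]

print(F.shape)                     # (7, 11)
print(F[3, 5])                     # 0.0, the origin of the filter
print(np.array_equal(F, flipped))  # True
```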

The reason for the O(nm log nm) running time is that image filtering can be done using a 2D FFT.
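As a minimal sketch of the FFT route, assuming Python with NumPy rather than MATLAB: the code below zero-pads the weights and the distance kernel, multiplies their 2D transforms, and crops the part that corresponds to the original grid. It uses a slightly tighter kernel, (2n-1) x (2m-1), than the 2n+1 x 2m+1 filter described above, which is enough to cover every offset between two cells. The function names are illustrative, and the brute-force version is included only to check the result on a small grid.

```python
import numpy as np

def total_dist_fft(W):
    """R[x0, y0] = sum over (x1, y1) of W[x1, y1] * distance((x0, y0), (x1, y1))."""
    n, m = W.shape
    # Kernel entry [n-1+dx, m-1+dy] holds sqrt(dx^2 + dy^2).
    dx = np.arange(-(n - 1), n)[:, None]
    dy = np.arange(-(m - 1), m)[None, :]
    F = np.hypot(dx, dy)
    # Pad to the full linear-convolution size so the circular product does not wrap.
    shape = (3 * n - 2, 3 * m - 2)
    full = np.fft.irfft2(np.fft.rfft2(W, shape) * np.fft.rfft2(F, shape), shape)
    # Crop the block aligned with the original grid (the padding is ignored).
    return full[n - 1:2 * n - 1, m - 1:2 * m - 1]

def total_dist_brute(W):
    """Same quantity by direct summation, for checking small grids."""
    n, m = W.shape
    xs, ys = np.indices((n, m))
    R = np.empty((n, m))
    for x0 in range(n):
        for y0 in range(m):
            R[x0, y0] = np.sum(W * np.hypot(xs - x0, ys - y0))
    return R

rng = np.random.default_rng(0)
W = rng.integers(1, 10, size=(6, 9)).astype(float)

R = total_dist_fft(W)
best_x, best_y = np.unravel_index(np.argmin(R), R.shape)  # 0-based indices
print(np.allclose(R, total_dist_brute(W)))
```

Because F is symmetric, the convolution computed through the FFT product is the same as the cross-correlation used in the text.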

Implementation

Here are two implementations in MATLAB. Both methods give the same final result, and they are benchmarked in a very simple fashion:

m=100;
n=100;
W0=abs(randn(m,n))+.001;

tic;

%The following padding is not necessary in the matlab code because
%matlab implements it in the imfilter function, from the imfilter
%documentation:
%  - Boundary options
% 
%        X            Input array values outside the bounds of the array
%                     are implicitly assumed to have the value X.  When no
%                     boundary option is specified, imfilter uses X = 0.

%W=padarray(W0,[m n]);

W=W0;
F=zeros(2*m+1,2*n+1);

for i=1:size(F,1)
    for j=1:size(F,2)
        %MATLAB indices start from 1, so the centre of F sits at
        %(m+1, n+1); hence the -m-1 and -n-1 offsets below
        F(i,j)=sqrt((i-m-1)^2 + (j-n-1)^2);
    end
end
R=imfilter(W,F);
[mr mc] = ind2sub(size(R),find(R == min(R(:))));
[mr, mc]
toc;

tic;
T=zeros([m n]);
best_x=-1;
best_y=-1;
best_val=inf;
for y0=1:m
    for x0=1:n

        total_dist = 0;

        for y1=1:m
            for x1=1:n
                total_dist = total_dist + W0(y1,x1) * sqrt((x0-x1)^2 + (y0-y1)^2);
            end
        end

        T(y0,x0) = total_dist;
        if ( total_dist < best_val ) 
            best_x = x0;
            best_y = y0;
            best_val = total_dist;
        end

    end
end
[best_y best_x]
toc;

%Named abs_diff to avoid shadowing MATLAB's built-in diff function
abs_diff=abs(T-R);
max_diff=max(abs_diff(:));
fprintf('The max difference between the two computations: %g\n', max_diff);

Performance

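For an 800x800 grid, on my PC, which is certainly not the fastest, the FFT method evaluates in about 700 seconds. The brute-force method had not finished after several hours and I had to kill it.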

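For further performance gains, you can move to a hardware platform such as GPUs. For example, using CUDA's FFT library you can compute 2D FFTs in a fraction of the time they take on a CPU. The key point is that the FFT method scales as you throw more hardware at the computation, whereas the brute-force method scales much worse.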

Observations

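While implementing this, I observed that almost every time the best_x, best_y values were one of floor(n/2)+-1. This suggests that the distance term most likely dominates the entire computation, so you could get away with computing total_dist for only 4 values, which makes this algorithm trivial!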
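A rough sketch of that shortcut, again in Python/NumPy: evaluate total_dist only for the cells within a small radius of the grid centre and keep the best one. This is a heuristic based on the observation above, not a guarantee; heavily skewed weights could move the true minimum elsewhere. The function name and the radius parameter are assumptions made for illustration.

```python
import numpy as np

def best_near_centre(W, radius=1):
    """Return (total_dist, x0, y0) for the best cell within `radius` of the centre."""
    n, m = W.shape
    xs, ys = np.indices((n, m))
    cx, cy = n // 2, m // 2  # 0-based centre cell
    best = None
    for x0 in range(max(cx - radius, 0), min(cx + radius, n - 1) + 1):
        for y0 in range(max(cy - radius, 0), min(cy + radius, m - 1) + 1):
            # Full weighted distance sum, but only for a handful of candidates.
            d = float(np.sum(W * np.hypot(xs - x0, ys - y0)))
            if best is None or d < best[0]:
                best = (d, x0, y0)
    return best

# With uniform weights the minimum is the centre cell by symmetry.
W = np.ones((5, 5))
d, x0, y0 = best_near_centre(W)
print(x0, y0)  # 2 2
```

Each candidate costs O(nm), so checking a constant number of central cells is O(nm) overall.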
