Question

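I am trying to find a way to speed up my code for a problem I have been working on. The basic idea of the code is quite simple. There are five inputs:

Four 1xm matrices (A, B, C, D), for some m < n (they may have different sizes), which are pairwise disjoint subsets of {1,2,...,n}, and an nxn symmetric binary matrix (M). The code checks an inequality for every combination of elements and, if the inequality holds, returns the values that made it hold, i.e.: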

    for a = A
      for b = B
        for c = C
          for d = D
            if M(a,c) + M(b,d) < M(a,d) + M(b,c)
              result = [a b c d];
              return
            end
          end
        end
      end
    end

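I know there has to be a better way to do this. First, since M is symmetric, I can halve the number of entries checked, because M(a,b) = M(b,a). I have been reading about vectorization and found several MATLAB functions I had never heard of (I am relatively new to it), but nothing that helps with this problem specifically. I have thought of other ways to approach it, but none of them is fully worked out, and I am not sure what to do at this point.

For example, I could split this into two cases: 1) The right-hand side is 1: then I have to check that both terms on the left-hand side are 0. 2) The right-hand side is 2: then I have to check that at least one term on the left-hand side is 0.

But, again, I cannot avoid the nesting.

Thanks for any help you can offer!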

Answer 1

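You are asking two questions here: (1) is there a more efficient algorithm for this search, and (2) how can it be vectorized in MATLAB. The first is very interesting to think about, but may be a bit beyond the scope of this forum. The second is easier to answer.

As pointed out in the comments under your question, you can vectorize the for loops by enumerating all possibilities and checking them all together, and the answers from this question can help: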

    [a,b,c,d] = ndgrid(A,B,C,D);     % Enumerate all combos
    a=a(:); b=b(:); c=c(:); d=d(:);  % Reshape from 4-D matrices to vectors
    ac = sub2ind(size(M),a,c);       % Convert subscript pairs to linear indices
    bd = sub2ind(size(M),b,d);
    ad = sub2ind(size(M),a,d);
    bc = sub2ind(size(M),b,c);
    mask = (M(ac) + M(bd) < M(ad) + M(bc));     % Test the inequality
    results = [a(mask), b(mask), c(mask), d(mask)]; % Select the ones that pass

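Again, this is not an algorithmic change: it still has the same complexity as the nested for loops. Vectorization may make it run faster, but it also lacks early termination, so in some cases it may be slower.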
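A possible middle ground, offered as a rough sketch rather than a tuned implementation: keep the outer loops over a and b so the search can still stop early, and vectorize only the inner (c,d) check. The toy M, A, B, C, D and the names lhs, rhs, ci, di are assumptions made for illustration. Note that the quadruple found first may differ from the one the fully nested loops would return, because find scans in column-major order.

```matlab
% Toy inputs, assumed for illustration only
M = false(6); M(1,6) = true; M(6,1) = true;   % symmetric binary matrix
A = [1 2]; B = 3; C = [4 5]; D = 6;           % pairwise disjoint index sets

result = zeros(0,4);
for a = A
    for b = B
        % lhs(i,j) = M(a,C(i)) + M(b,D(j)); rhs(i,j) = M(a,D(j)) + M(b,C(i))
        lhs = bsxfun(@plus, double(M(a,C)).', double(M(b,D)));
        rhs = bsxfun(@plus, double(M(b,C)).', double(M(a,D)));
        [ci, di] = find(lhs < rhs, 1, 'first');   % first passing (c,d), if any
        if ~isempty(ci)
            result = [a, b, C(ci), D(di)];
            break                                 % leave the loop over b
        end
    end
    if ~isempty(result), break, end               % leave the loop over a
end
```

Each pass through the inner loop touches only a numel(C)-by-numel(D) matrix, so memory stays small while the early exit is preserved.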

Answer 2

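If you have access to the Neural Network Toolbox, combvec may help here.

Running allCombs = combvec(A,B,C,D) gives you a 4-by-(m1*m2*m3*m4) matrix containing every combination as a column, something like this (the column order shown is only illustrative):

    [...
    a1, a1, a1, a1, a1 ... a1... a2... am1;
    b1, b1, b1, b1, b1 ... b2... b1... bm2;
    c1, c1, c1, c1, c2 ... c1... c1... cm3;
    d1, d2, d3, d4, d1 ... d1... d1... dm4]

You can then use sub2ind and matrix indexing to set up the values needed for the inequality: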

    indices = [sub2ind(size(M),allCombs(1,:),allCombs(3,:));
               sub2ind(size(M),allCombs(2,:),allCombs(4,:));
               sub2ind(size(M),allCombs(1,:),allCombs(4,:));
               sub2ind(size(M),allCombs(2,:),allCombs(3,:))];

    testValues = M(indices);
    testValues(5,:) = (testValues(1,:) + testValues(2,:) < testValues(3,:) + testValues(4,:))

Your final a,b,c,d indices can then be retrieved with

    allCombs(:,find(testValues(5,:)))

This prints a matrix containing every column for which the inequality is true.
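As a small aside, the same selection can be written with a logical mask instead of find. The sketch below does not call combvec; it uses a hand-built allCombs as a stand-in for its output, and the toy M and the helper idx are assumptions made for illustration.

```matlab
% Hand-built stand-in for the combinations of A=1, B=2, C=[3 4], D=5
allCombs = [1 1;    % a
            2 2;    % b
            3 4;    % c
            5 5];   % d

% Toy symmetric binary matrix, assumed for illustration
M = false(5);
M(1,5) = true; M(5,1) = true;
M(1,4) = true; M(4,1) = true;

% Linear indices of M(r,s) for every column of allCombs
idx = @(r,s) sub2ind(size(M), allCombs(r,:), allCombs(s,:));

% M(a,c) + M(b,d) < M(a,d) + M(b,c), evaluated column by column
mask = M(idx(1,3)) + M(idx(2,4)) < M(idx(1,4)) + M(idx(2,3));

hits = allCombs(:, mask);   % columns that satisfy the inequality
```

Here only the first column passes, so hits holds the single quadruple a=1, b=2, c=3, d=5.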

This article may be useful.

Answer 3

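Since M is binary, we can think of this as a graph problem. Each i, j in {1..n} corresponds to a node, and M(i,j) indicates whether there is an undirected edge connecting them.

Since A, B, C, D are disjoint, the problem simplifies somewhat. We can approach it in stages: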

1. Find all (c,d) for which there exists an a such that M(a,c) < M(a,d). Call this set CD_lt_a (the subset of C*D for which the "less than" inequality holds for some a).
2. Find all (c,d) for which there exists an a such that M(a,c) <= M(a,d), and call this set CD_le_a.
3. Repeat for b, forming CD_lt_b for M(b,d) < M(b,c) and CD_le_b for M(b,d) <= M(b,c).
4. One way to satisfy the overall inequality is M(a,c) < M(a,d) and M(b,d) <= M(b,c), so we can look at the intersection of CD_lt_a and CD_le_b.
5. The other way is M(a,c) <= M(a,d) and M(b,d) < M(b,c), so look at the intersection of CD_le_a and CD_lt_b.
6. With (c,d) known, we can go back and find (a,b).

So my implementation is:

    % 0. Some preliminaries
    % Get the size of each set
    mA = numel(A);
    mB = numel(B);
    mC = numel(C);
    mD = numel(D);

    % 1. Find all (c,d) for which there exists a such that M(a,c) < M(a,d)
    CA_linked = M(C,A);
    AD_linked = M(A,D);
    CA_not_linked = ~CA_linked;
    % Multiplying these matrices tells us, for each (c,d), how many nodes
    % in A satisfy this M(a,c)<M(a,d) inequality
    % Ugh, we need to cast to double to use the matrix multiplication
    CD_lt_a = (CA_not_linked * double(AD_linked)) > 0;

    % 2. For M(a,c) <= M(a,d), check that the converse is false for some a
    AD_not_linked = ~AD_linked;
    CD_le_a = (CA_linked * double(AD_not_linked)) < mA;

    % 3. Repeat for b
    CB_linked = M(C,B);
    BD_linked = M(B,D);
    CD_lt_b = (CB_linked * double(~BD_linked)) > 0;
    CD_le_b = (~CB_linked * double(BD_linked)) < mB;

    % 4. Find the intersection of CD_lt_a and CD_le_b - this is one way
    % to satisfy the inequality M(a,c)+M(b,d) < M(a,d)+M(b,c)
    CD_satisfy_ineq_1 = CD_lt_a & CD_le_b;

    % 5. The other way to satisfy the inequality is CD_le_a & CD_lt_b
    CD_satisfy_ineq_2 = CD_le_a & CD_lt_b;

    inequality_feasible = any(CD_satisfy_ineq_1(:) | CD_satisfy_ineq_2(:));
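For readers who want to see why the two cases in the staged approach cover everything, here is a short derivation. It assumes only that M takes the values 0 and 1; the symbols x and y are introduced here for illustration and do not appear in the code.

```latex
\begin{aligned}
M(a,c) + M(b,d) < M(a,d) + M(b,c)
  &\iff \underbrace{M(a,c) - M(a,d)}_{x} + \underbrace{M(b,d) - M(b,c)}_{y} < 0,
  \qquad x, y \in \{-1, 0, 1\} \\
  &\iff (x = -1 \,\wedge\, y \le 0) \;\vee\; (x \le 0 \,\wedge\, y = -1) \\
  &\iff \bigl(M(a,c) < M(a,d) \,\wedge\, M(b,d) \le M(b,c)\bigr)
        \;\vee\; \bigl(M(a,c) \le M(a,d) \,\wedge\, M(b,d) < M(b,c)\bigr)
\end{aligned}
```

Because a appears only in x and b only in y, the two conditions can be tested independently for each fixed (c,d), which is what lets the search over a and the search over b be done separately.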

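Note that you can stop here if feasibility is your only concern. The complexity is A*C*D + B*C*D (each letter standing for the size of that set), which is better than the worst-case A*B*C*D complexity of the nested for loops. However, early termination means the nested for loops may still be faster in some cases.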
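One way to gain confidence in the feasibility test is to compare it with brute force on small random instances. The sketch below is such a cross-check under assumed toy sizes; the densities, the set sizes, and the names fast, slow, ix and nChecked are all illustrative choices, not part of the original answer.

```matlab
rng(0);                                   % fixed seed so the run is repeatable
n = 12;
A = 1:3; B = 4:6; C = 7:9; D = 10:12;     % pairwise disjoint index sets
densities = [0 0.02 0.1 0.5 1];           % includes all-zero and all-one cases
nChecked = 0;
for trial = 1:200
    p = densities(mod(trial-1, 5) + 1);
    M = triu(rand(n) < p, 1);             % random strictly upper triangle
    M = M | M.';                          % make it symmetric

    % Feasibility via matrix products (same logic as steps 1-5)
    CA = double(M(C,A)); AD = double(M(A,D));
    CB = double(M(C,B)); BD = double(M(B,D));
    CD_lt_a = ((1 - CA) * AD) > 0;            % some a with M(a,c) <  M(a,d)
    CD_le_a = (CA * (1 - AD)) < numel(A);     % some a with M(a,c) <= M(a,d)
    CD_lt_b = (CB * (1 - BD)) > 0;            % some b with M(b,d) <  M(b,c)
    CD_le_b = ((1 - CB) * BD) < numel(B);     % some b with M(b,d) <= M(b,c)
    feasible = (CD_lt_a & CD_le_b) | (CD_le_a & CD_lt_b);
    fast = any(feasible(:));

    % Brute force over every quadruple
    [a,b,c,d] = ndgrid(A,B,C,D);
    ix = @(r,s) sub2ind(size(M), r(:), s(:));
    slow = any(M(ix(a,c)) + M(ix(b,d)) < M(ix(a,d)) + M(ix(b,c)));

    assert(fast == slow);                 % the two methods must agree
    nChecked = nChecked + 1;
end
```

Agreement on random cases is not a proof, but a disagreement would point straight at a mistake in one of the four products.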

The next block of code enumerates all the a,b,c,d that satisfy the inequality. It is not well optimized (it appends to a matrix inside a loop), so it can be slow if there are many results.

    % 6. With (c,d) known, find a and b
    % We can define these functions to help us search
    find_a_lt = @(c,d) find(CA_not_linked(c,:)' & AD_linked(:,d));
    find_a_le = @(c,d) find(CA_not_linked(c,:)' | AD_linked(:,d));
    find_b_lt = @(c,d) find(CB_linked(c,:)' & ~BD_linked(:,d));
    find_b_le = @(c,d) find(CB_linked(c,:)' | ~BD_linked(:,d));

    % I'm gonna assume there aren't too many results, so I will be appending
    % to an array inside of a for loop. Bad for performance, but maybe a bit
    % more readable for a StackOverflow answer.
    results = zeros(0,4);

    % Find those that satisfy it the first way
    [c_list,d_list] = find(CD_satisfy_ineq_1);
    for ii = 1:numel(c_list)
        c = c_list(ii); d = d_list(ii);
        a = find_a_lt(c,d);
        b = find_b_le(c,d);
        % a,b might be vectors, in which case all combos are valid
        % Many ways to find all combos, gonna use ndgrid()
        [a,b] = ndgrid(a,b);
        % Append these to the growing list of results
        abcd = [a(:), b(:), repmat([c d],[numel(a),1])];
        results = [results; abcd];
    end

    % Repeat for the second way
    [c_list,d_list] = find(CD_satisfy_ineq_2);
    for ii = 1:numel(c_list)
        c = c_list(ii); d = d_list(ii);
        a = find_a_le(c,d);
        b = find_b_lt(c,d);
        % a,b might be vectors, in which case all combos are valid
        % Many ways to find all combos, gonna use ndgrid()
        [a,b] = ndgrid(a,b);
        % Append these to the growing list of results
        abcd = [a(:), b(:), repmat([c d],[numel(a),1])];
        results = [results; abcd];
    end

    % Remove duplicates
    results = unique(results, 'rows');

    % And actually these a,b,c,d will be indices into A,B,C,D because they
    % were obtained from calling find() on submatrices of M.
    if ~isempty(results)
        results(:,1) = A(results(:,1));
        results(:,2) = B(results(:,2));
        results(:,3) = C(results(:,3));
        results(:,4) = D(results(:,4));
    end

I tested this on the following test case:

    m = 1000;
    A = (1:m); B = A(end)+(1:m); C = B(end)+(1:m); D = C(end)+(1:m);
    M = rand(D(end),D(end)) < 1e-6; M = M | M';

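I like to think that the first part (checking whether the inequality is feasible for any a,b,c,d) works quite well. The other vectorized answers (which use ndgrid or combvec to enumerate every combination of a,b,c,d) would need 8 terabytes of memory for a problem of this size!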
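The 8 terabyte figure can be reproduced with a quick back-of-the-envelope calculation. This sketch assumes double-precision storage (8 bytes per element) and counts a single array of all combinations; the variable names are illustrative.

```matlab
m = 1000;                        % size of each set in the test case
nCombos = m * m * m * m;         % number of (a,b,c,d) combinations
bytesPerArray = 8 * nCombos;     % one double-precision array of that length
fprintf('%.0f TB per array\n', bytesPerArray / 1e12);
```

Since ndgrid with four outputs returns four such arrays, and the combvec result has four rows, the full enumeration would need roughly four times that amount.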

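However, I would not recommend running the second part (enumerating all results) when more than a few hundred c,d satisfy the inequality, because it will be very slow.

P.S. I know I already answered, but that answer was about vectorizing such loops in general, rather than being specific to your problem.

P.P.S. This reminds me a bit of the stable marriage problem. Perhaps some of the references there also contain algorithms relevant to your problem. I suspect that a true graph-based algorithm could achieve the same worst-case complexity as this one while additionally offering early termination. But I think it is difficult to implement graph-based algorithms efficiently in MATLAB.

P.P.P.S. If you only want one feasible solution, you can simplify step 6 to return a single value, e.g.

    find_a_lt = @(c,d) find(CA_not_linked(c,:)' & AD_linked(:,d), 1, 'first');
    find_a_le = @(c,d) find(CA_not_linked(c,:)' | AD_linked(:,d), 1, 'first');
    find_b_lt = @(c,d) find(CB_linked(c,:)' & ~BD_linked(:,d), 1, 'first');
    find_b_le = @(c,d) find(CB_linked(c,:)' | ~BD_linked(:,d), 1, 'first');

    % Use (:) so that any() reduces the whole matrix to a single logical
    if any(CD_satisfy_ineq_1(:))
        [c,d] = find(CD_satisfy_ineq_1, 1, 'first');
        a = find_a_lt(c,d);
        b = find_b_le(c,d);
        result = [A(a), B(b), C(c), D(d)];
    elseif any(CD_satisfy_ineq_2(:))
        [c,d] = find(CD_satisfy_ineq_2, 1, 'first');
        a = find_a_le(c,d);
        b = find_b_lt(c,d);
        result = [A(a), B(b), C(c), D(d)];
    else
        result = zeros(0,4);
    end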
