如何找到覆盖另一个列表中所有元素所需的最少数量的列表

时间:2016-03-31 03:51:04

标签: algorithm matlab combinatorics set-theory set-cover

我正在使用Matlab处理代码,我需要在其中找到覆盖引用列表所有元素所需的最少数字列表(在一些给定列表中)。

例如,假设我的参考列表是

X = [0 1 2 3 4 5 6 7 8 9] 

我有一组给定的列表如下:

A = [0 1 3 5 6 7 9]
B = [0 1 2 3 4]
C = [5 6 7 8 9]
D = [1 2 3 4]
E = [1 5 7 8]

涵盖X中每个元素所需的最少数量的列表是2BC),但是,如果我最初只搜索列表,涵盖了大多数元素(A),然后尝试查找将涵盖其余元素的其他列表,我最终将使用至少3列表。编写可以搜索所需的最少数量的列表的代码的最佳方法是什么(它会给我输出BC)?任何帮助都将非常感激...即使只是一个概念性解释(而非实际代码)如何最好地解决这个问题将是一个巨大的帮助!

2 个答案:

答案 0 :(得分:4)

方法#1:所有可能组合的迭代“暴力”

下面是一个可能的算法,说明如何解决问题。代码本身应该是不言自明的,但我们的想法是我们测试所有可能的列表组合,直到找到有效的列表(因此我们没有遇到您描述的问题,我们错误地根据它们的长度选择列表)。 / p>

function varargout = q36323802
R = [0 1 2 3 4 5 6 7 8 9]; %// Reference List

L = {... // As per Dan's suggestion:
 [0 1 3 5 6 7 9]
 [0 1 2 3 4]
 [5 6 7 8 9]
 [1 2 3 4]
 [1 5 7 8]
 };
out = []; %// Initialize output
%% // Brute-force approach:
nLists = numel(L);
for indN = 1:nLists
  setCombinationsToCheck = nchoosek(1:nLists,indN);  
  for indC = 1:size(setCombinationsToCheck,1)
    u = unique(cat(2,L{setCombinationsToCheck(indC,:)}));
    if all(ismember(R,u))
      out = setCombinationsToCheck(indC,:);
      disp(['The minimum number of required sets is ' num2str(indN) ...
      ', and their indices are: ' num2str(out)]);
      return;
    end
  end
end
disp('No amount of lists found to cover the reference.');

if nargout > 0 
  varargout{1} = out;
end

对于您的示例,输出为:

  

The minimum number of required sets is 2, and their indices are: 2 3

注(S):

  1. 此方法通过在迭代n-1中不使用长度为n的列表来执行一些冗余计算,这些列表已在先前的迭代中找到(适用时)。在这种情况下,递归解决方案可能有效。
  2. 可能有一种方法可以对此进行矢量化,我并没有深入思考。
  3. 我假设所有输入都是行向量。如果情况并非如此,则必须有一些额外的步骤。
  4. 感谢 Adiel 提出了一些改进建议,感谢Amro找到了一些错误!

    方法#2:树搜索 实验

    我也试图构建一个递归求解器。现在它找到了一个解决方案,但它不够通用(实际上问题是它只返回第一个结果,不一定是最好的结果)。这种方法背后的原因是我们可以将您的问题视为树搜索问题,因此我们可以使用搜索/寻路算法(请参阅BFSDFSIDS等)。我认为下面的算法最接近DFS。和以前一样,这应该主要说明解决问题的方法。

    function q36323802_DFS(R,L)
    %% //Input checking:
    if nargin < 2 || isempty(L)
      L = {... // As per Dan's suggestion:
       [0 1 3 5 6 7 9]
       [0 1 2 3 4]
       [5 6 7 8 9]
       [1 2 3 4]
       [1 5 7 8]
       };
    end
    if nargin < 1 || isempty(R)
      R = [0 1 2 3 4 5 6 7 8 9]; %// Reference List
    end
    %% // Algorithm (DFS: breadth-first search):
    out = DFS_search(R,L,0);
    if isempty(out)
      disp('No amount of lists found to cover the reference.');
    else
      disp(['The minimum number of required sets is ' num2str(numel(out)) ...
        ', and their indices are: ' num2str(out)]);
    end
    end
    
    function out = DFS_search(R,L,depth)
      %// Check to see if we should stop:
      if isempty(R) || isempty(L)
        % // Backtrack here?
        out = [];
        return;
      end
      if isnan(R)
        out = [];
        return;
      end
    
      nLists = numel(L);
      reducedR = cellfun(@(R,L)setdiff(R,L),repmat({R},[nLists,1]),L,'UniformOutput',false)';
     %'//  We consider a case where the reduction had no effect as "hopeless" and
      %// "drop" it.
      isFullCoverage = cellfun(@isempty,reducedR);
      isHopeless = cellfun(@(R)all(isnan(R)),reducedR) | cellfun(@(rR)isequal(rR,R),reducedR);
      reducedR(isHopeless) = deal({NaN});
      if all(isHopeless) && ~any(isFullCoverage)
        out = [];
        return
      end
      if any(isFullCoverage) %// Check current "breadth level"
        out = find(isFullCoverage,1,'first');
        return
      else
        for indB = 1:nLists
          out = DFS_search(reducedR{indB},L,depth+1);
          if ~isempty(out)
            out = [indB out]; %#ok
            %// TODO: test if one of the sets is covered by the others and remove it
            %// from the list "out".
            %// Also, keep track of the best path and only return (finally) if shortest
            return
          end
        end
      end  
    end
    

答案 1 :(得分:1)

通过Amro对Dev-iL&#39; 1 st 方法的类似解决方案:

function varargout = q36323802A
R = [0 1 2 3 4 5 6 7 8 9];

names = {'A' 'B' 'C' 'D' 'E'};
L = {...
    [0 1 3 5 6 7 9]
    [0 1 2 3 4]
    [5 6 7 8 9]
    [1 2 3 4]
    [1 5 7 8]
};
N = numel(L);

%// powerset of L: set of all subsets (excluding empty set)
powerset = cell(1,N);
for k=1:N
    sets = nchoosek(1:N, k);
    powerset{k} = num2cell(sets,2);
end
powerset = cat(1, powerset{:});

%// for each possible subset, check if it covers the target R
mask = false(size(powerset));
for i=1:numel(powerset)
    elems = unique([L{powerset{i}}]);
    mask(i) = all(ismember(R, elems));
end
if ~any(mask), error('cant cover target'); end

%// from candidates, choose the one with least amount of sets
candidates = powerset(mask);
len = cellfun(@numel, candidates);
[~,idx] = min(len);
out = candidates{idx};
varargout{1} = names(out);