Question

I have a string:

 sup_pairs = 'BA CE DF EF AE FC GD DA CG EA AB BG'

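How can I combine the pairs into a string so that the last character of one pair is the first character of the pair that follows it? The new string must contain all of the characters 'A', 'B', 'C', 'D', 'E', 'F', 'G' that appear in the sup_pairs string. The expected output should be: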

 S1 = 'BAEFCGD' % because BA will be followed by AE in sup_pairs string, so we combine BAE, and so on...we continue the rule to generate S1

 S2 = 'DFCEABG'

If I have AB, BC and BD, the generated strings should be ABC and ABD. If a character repeats across the pairs, as in AB BC CA CE, we skip the second A and get ABCE.

Answer 1

This, like all good things in life, is a graph problem. Each letter is a node, and each pair is a directed edge.

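First, we have to convert your string of pairs to a numeric format so we can use the letters as subscripts. I'll use A=2, B=3, ..., G=8:

sup_pairs = 'BA CE DF EF AE FC GD DA CG EA AB BG';
p = strsplit(sup_pairs, ' ');     % cell array of 2-character pairs
m = cell2mat(p(:));               % 12x2 char matrix, one pair per row
m = m - '?';                      % '?' is char(63), so 'A' -> 2, ..., 'G' -> 8
A = sparse(m(:,1), m(:,2), 1);    % A(i,j) = 1 for each pair (i,j)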

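The sparse matrix A is now the adjacency matrix representing our pairs (actually, it is more like an adjacency list). If you look at the full matrix of A, it looks like this: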

>> full(A)
ans =
   0   0   0   0   0   0   0   0
   0   0   1   0   0   1   0   0
   0   1   0   0   0   0   0   1
   0   0   0   0   0   1   0   1
   0   1   0   0   0   0   1   0
   0   1   0   0   0   0   1   0
   0   0   0   1   0   0   0   0
   0   0   0   0   1   0   0   0

As you can see, the edge BA, which translates to the subscript (3,2), is equal to 1.
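As a quick check of the mapping described above, here is a minimal sketch that looks up individual pairs in A. The helper name `has_edge` is hypothetical and not part of the original answer.

```matlab
% Rebuild the adjacency matrix from the pairs (A=2, B=3, ..., G=8).
sup_pairs = 'BA CE DF EF AE FC GD DA CG EA AB BG';
p = strsplit(sup_pairs, ' ');
m = cell2mat(p(:)) - '?';
A = sparse(m(:,1), m(:,2), 1);

% Hypothetical helper: true if the two-letter pair is an edge in A.
has_edge = @(pair) full(A(pair(1) - '?', pair(2) - '?')) == 1;

ba = has_edge('BA');   % true: BA is one of the pairs, stored at A(3,2)
ad = has_edge('AD');   % false: only DA is listed, and edges are directed
```

Note that the graph is directed: DA being present says nothing about AD.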

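Now you can use your favorite implementation of depth-first search (DFS) to traverse the graph from a starting node of your choice. Each path from the root node to a leaf node represents a valid string. You then convert the path back into a sequence of letters: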

treepath=[3,2,6,7,4,8,5];
S1=char(treepath+'?');

Output:
S1 = BAEFCGD
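To confirm that a candidate path really is a valid string, one option is to check that every consecutive step is an edge in A and that no node repeats. This is only a sketch; the variable names are illustrative and not from the original answer.

```matlab
% Rebuild the adjacency matrix from the pairs (A=2, B=3, ..., G=8).
sup_pairs = 'BA CE DF EF AE FC GD DA CG EA AB BG';
p = strsplit(sup_pairs, ' ');
m = cell2mat(p(:)) - '?';
A = sparse(m(:,1), m(:,2), 1);

treepath = [3,2,6,7,4,8,5];

% Linear indices of each consecutive (from, to) step along the path.
steps = sub2ind(size(A), treepath(1:end-1), treepath(2:end));

all_edges_exist = all(full(A(steps)) == 1);               % every step is a listed pair
no_repeats = numel(unique(treepath)) == numel(treepath);  % each letter used once
is_valid = all_edges_exist && no_repeats;
```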

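Here is a recursive implementation of DFS to get you on your way. Normally in MATLAB you would have to watch out for the default limit on recursion depth, but you are finding Hamiltonian paths here, which is an NP-complete problem. If you ever get anywhere near the recursion limit, the computation time will be so enormous that increasing the depth will be the least of your worries.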

function full_paths = dft_all(A, current_path)
   % A - adjacency matrix of graph
   % current_path - initially just the start node (root)
   % full_paths - cell array containing all paths from initial root to a leaf

   n = size(A, 1);   % number of nodes in graph 
   full_paths = cell(1,0);   % return cell array 

   unvisited_mask = ones(1, n);
   unvisited_mask(current_path) = 0;   % mask off already visited nodes (path)
   % multiply mask by array of nodes accessible from last node in path
   unvisited_nodes = find(A(current_path(end), :) .* unvisited_mask);

   % add restriction on length of paths to keep (numel == n)
   if isempty(unvisited_nodes) && (numel(current_path) == n)
      full_paths = {current_path};   % we've found a leaf node
      return;
   end

   % otherwise, still more nodes to search
   for node = unvisited_nodes
      new_path = dft_all(A, [current_path node]);   % add new node and search
      if ~isempty(new_path)   % if this produces a new path...
         full_paths = {full_paths{1,:}, new_path{1,:}};   % add it to output 
      end
   end

end

This is a normal depth-first traversal, except for the added condition on the path length in line 15:

   if isempty(unvisited_nodes) && (numel(current_path) == n)

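The first half of the if condition, isempty(unvisited_nodes), is standard. If you use only this part of the condition, you get all paths from the start node to a leaf, regardless of path length. (Hence the cell array output.) The second half, (numel(current_path) == n), enforces the length of the path.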

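I took a shortcut here, because n is the number of nodes in the adjacency matrix, which in the example is 8 rather than 7, the number of letters in the alphabet. But node 1 has no edges, because I was apparently planning to use a trick that I never told you about. Rather than running DFS from every node to get all of the paths, you can create a dummy node (in this case, node 1) and add an edge from it to every real node. Then you only have to call DFS once, on node 1, to get all of the paths. Here is the updated adjacency matrix: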

A =
   0   1   1   1   1   1   1   1
   0   0   1   0   0   1   0   0
   0   1   0   0   0   0   0   1
   0   0   0   0   0   1   0   1
   0   1   0   0   0   0   1   0
   0   1   0   0   0   0   1   0
   0   0   0   1   0   0   0   0
   0   0   0   0   1   0   0   0
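The answer does not show how the dummy row was added; one way to do it, as a minimal sketch, is to assign ones into row 1 for every real node:

```matlab
% Rebuild the adjacency matrix from the pairs (A=2, B=3, ..., G=8).
sup_pairs = 'BA CE DF EF AE FC GD DA CG EA AB BG';
p = strsplit(sup_pairs, ' ');
m = cell2mat(p(:)) - '?';
A = sparse(m(:,1), m(:,2), 1);

% Node 1 is the dummy root: add an edge from it to every real node (2..8).
% No edge points back to node 1, so it can only ever start a path.
A(1, 2:end) = 1;

first_row = full(A(1, :));
```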

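If you don't want to use this trick, you can change the condition to n-1, or change the adjacency matrix so that it does not include node 1. Note that if you do keep node 1, you need to remove it from the resulting paths.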
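For the second alternative, a rough sketch of dropping node 1 might look like the following. The names `A7` and `path7` are illustrative only. With node 1 gone, the letters map to A=1, ..., G=7, so the character offset becomes '@' (one below 'A') instead of '?', and the DFS would have to be started once from each of the seven nodes.

```matlab
% Rebuild the 8x8 adjacency matrix from the pairs (A=2, B=3, ..., G=8).
sup_pairs = 'BA CE DF EF AE FC GD DA CG EA AB BG';
p = strsplit(sup_pairs, ' ');
m = cell2mat(p(:)) - '?';
A = sparse(m(:,1), m(:,2), 1);

% Drop the unused first row and column; nodes are now A=1, ..., G=7.
A7 = A(2:end, 2:end);

% The example path BAEFCGD under the new numbering.
path7 = [2 1 5 6 3 7 4];
S1 = char(path7 + '@');
```

The same 7x7 matrix could also be built directly by subtracting '@' instead of '?' when converting the pairs.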

Here is the output of the function using the updated matrix with the dummy node:

>> dft_all(A, 1)
ans =
{
  [1,1] =

     1   2   3   8   5   7   4   6

  [1,2] =

     1   3   2   6   7   4   8   5

  [1,3] =

     1   3   8   5   2   6   7   4

  [1,4] =

     1   3   8   5   7   4   6   2

  [1,5] =

     1   4   6   2   3   8   5   7

  [1,6] =

     1   5   7   4   6   2   3   8

  [1,7] =

     1   6   2   3   8   5   7   4

  [1,8] =

     1   6   7   4   8   5   2   3

  [1,9] =

     1   7   4   6   2   3   8   5

  [1,10] =

     1   8   5   7   4   6   2   3

}
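To turn these paths back into strings, the dummy node has to be stripped from the front of each one before applying the same character offset as before. A minimal sketch, with three of the paths above hard-coded in place of the full dft_all output:

```matlab
% Three of the paths listed above (1st, 2nd and 6th), hard-coded for illustration.
paths = {[1 2 3 8 5 7 4 6], [1 3 2 6 7 4 8 5], [1 5 7 4 6 2 3 8]};

% Drop the dummy node (first element) and map 2..8 back to 'A'..'G'.
strings = cellfun(@(q) char(q(2:end) + '?'), paths, 'UniformOutput', false);
```

Here `strings` holds 'ABGDFCE', 'BAEFCGD' and 'DFCEABG'; the last two are the S1 and S2 from the question.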
