Question

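My goal is to find similar phrases in two pieces of text.

I know that common words will be a problem, for example "and", "the", "we", "are". I think a filter will be needed to handle them.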
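
Such a filter could look roughly like the sketch below. The `STOP_WORDS` set and the `remove_stop_words` helper are hypothetical names used only for illustration, and a real stop-word list would be much longer.

```python
# Hypothetical stop-word list, built from the examples mentioned above.
STOP_WORDS = {"and", "the", "we", "are"}


def remove_stop_words(words):
    """Return the words that are not in STOP_WORDS (case-insensitive)."""
    return [w for w in words if w.lower() not in STOP_WORDS]


print(remove_stop_words("the cat is on the roof".split()))
```

Note that filtering before matching changes which words are adjacent, so it may be safer to filter the matched phrases afterwards instead.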

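Would the following be a good approach? It uses recursion: when a match is found, it checks whether the next word also matches, and keeps going until there is no match.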

  1. the cat is on the roof
  2. a man is on the stage

  A1 = [the, cat, is, on, the, roof]
  A2 = [a, man, is, on, the, stage]

  [the]: no match
  [cat]: no match
  [is]: match
  [is, on]: match
  [is, on, the]: match
  [is, on, the, roof]: no match
  [on]: match
  [on, the]: match
  [on, the, roof]: no match
  [the]: match
  [the, roof]: no match
  [roof]: no match
  -end-
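
The walk-through above can be sketched in Python roughly as follows. This is one possible reading of the idea, not a definitive implementation: `contains_phrase` and `matching_phrases` are hypothetical helper names, and the sketch uses a loop rather than recursion. Unlike the trace, which appears to compare words at the same position, it looks for each phrase anywhere in the second text, so the first "the" also counts as a match.

```python
def contains_phrase(words, phrase):
    """Return True if `phrase` occurs as a contiguous run inside `words`."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def matching_phrases(a1, a2):
    """For each start index in a1, grow the phrase while it still occurs in a2."""
    matches = []
    for start in range(len(a1)):
        end = start + 1
        # Extend the phrase one word at a time until it no longer matches.
        while end <= len(a1) and contains_phrase(a2, a1[start:end]):
            matches.append(a1[start:end])
            end += 1
    return matches


a1 = "the cat is on the roof".split()
a2 = "a man is on the stage".split()
for phrase in matching_phrases(a1, a2):
    print(phrase)
```

Every sub-phrase of a longer match is reported as well ([is], [is, on], [is, on, the]), just as in the trace. Keeping only the longest phrase per start index would cut the output down.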

Answer 1

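A quick Google search turned up this website, which contains a solution to your problem:

It works by finding the longest sequence of words common to both strings, then recursively finding the longest common sequences in the remainders of the strings, until the substrings have no words in common. At that point it records the remaining new words as an insertion and the remaining old words as a deletion.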
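
The description above can be sketched as follows. This is a rough illustration written from that description alone, not the code from the linked site; the names `longest_common_run` and `diff` and the `"="`, `"-"`, `"+"` tags are assumptions made for this example.

```python
def longest_common_run(old, new):
    """Return (start_old, start_new, length) of the longest common contiguous run."""
    best = (0, 0, 0)
    # prev[j] = length of the common run ending at old[i-2] and new[j-1]
    prev = [0] * (len(new) + 1)
    for i in range(1, len(old) + 1):
        curr = [0] * (len(new) + 1)
        for j in range(1, len(new) + 1):
            if old[i - 1] == new[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[2]:
                    best = (i - curr[j], j - curr[j], curr[j])
        prev = curr
    return best


def diff(old, new):
    """Split around the longest common run and recurse on both remainders."""
    i, j, n = longest_common_run(old, new)
    if n == 0:
        # No words in common: old words are deletions, new words are insertions.
        ops = []
        if old:
            ops.append(("-", old))
        if new:
            ops.append(("+", new))
        return ops
    return diff(old[:i], new[:j]) + [("=", old[i:i + n])] + diff(old[i + n:], new[j + n:])


old = "the cat is on the roof".split()
new = "a man is on the stage".split()
for op in diff(old, new):
    print(op)
```

The entries tagged `"="` are the phrases shared by both texts; for the two example sentences the only such entry is [is, on, the].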
