Question

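I would like to know whether an algorithm like this exists and whether it is implemented in any database (ideally Postgres).

Levenshtein matches strings character by character, but I want to compare strings by the number of matching words. For example, given: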

the quick brown fox jumps over the lazy dog

if I try to match it against

the pen lies over the table

I should get a result of 2, because it matches "the" and "over" in the first sentence.

Answer 1

Here is a SQL approach using arrays:

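-- Count the distinct words that both sentences share (case-insensitive).
select count(*) from
(
  -- upper() normalizes case; string_to_array + unnest yields one row per word
  (select distinct unnest(string_to_array(upper('the quick brown fox jumps over the lazy dog'), ' ')))
  intersect all
  (select distinct unnest(string_to_array(upper('the pen lies over the table'), ' ')))
) t3;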

http://sqlfiddle.com/#!12/724f7/6
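
To sanity-check the query outside the database, the same idea can be sketched in Python. This is only an illustration and not part of the original answer; the function names `shared_distinct_words` and `shared_words_with_repeats` are made up here, and splitting on whitespace is an assumption about what counts as a word. The second function shows why the `distinct` matters: without it, `intersect all` keeps a repeated word as many times as it occurs on both sides.

```python
from collections import Counter


def shared_distinct_words(a: str, b: str) -> int:
    """Number of distinct words found in both sentences (case-insensitive)."""
    # Same steps as the query: fold case, split into words, deduplicate, intersect.
    return len(set(a.upper().split()) & set(b.upper().split()))


def shared_words_with_repeats(a: str, b: str) -> int:
    """Like the query without DISTINCT: a word counts min(m, n) times."""
    words_a = Counter(a.upper().split())
    words_b = Counter(b.upper().split())
    # Counter & Counter keeps the minimum count of each common word.
    return sum((words_a & words_b).values())


first = "the quick brown fox jumps over the lazy dog"
second = "the pen lies over the table"
print(shared_distinct_words(first, second))      # 2 ("the", "over")
print(shared_words_with_repeats(first, second))  # 3 ("the" twice, "over" once)
```

The question expects 2 for this pair, which corresponds to the deduplicated count.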

Answer 2

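I believe you can also use the same algorithm that was designed for letters. See also this question. Comparing by words is not common, and I'm fairly sure PostgreSQL does not support it (nor do I know of any other database that does). However, as long as you can use arrays, as shown in David Aldridge's answer, you can write your own stored procedure for this.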

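You can take inspiration for the algorithm from Wikibooks: just replace String with List<String>, length() with size(), and the char comparison with equals(), and it will work. You can then implement the same thing in SQL; all you need is array allocation (plus some index arithmetic if you cannot use two-dimensional arrays). In the worst case, you can use temporary tables instead of arrays.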
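
As a rough sketch of that substitution, here is the usual dynamic-programming Levenshtein distance applied to lists of words instead of characters. It is written in Python for brevity rather than as a stored procedure, and it is an illustration under assumptions, not the Wikibooks code: the name `word_levenshtein` is hypothetical, words are split on whitespace, and comparison is case-insensitive. It keeps only two rows of the table, so no two-dimensional array is needed.

```python
def word_levenshtein(a: str, b: str) -> int:
    """Edit distance between two sentences, counted in whole words."""
    xs = a.lower().split()
    ys = b.lower().split()
    # prev[j] = distance between the words of xs processed so far and ys[:j]
    prev = list(range(len(ys) + 1))
    for i, x in enumerate(xs, start=1):
        curr = [i]  # distance from xs[:i] to the empty word list
        for j, y in enumerate(ys, start=1):
            cost = 0 if x == y else 1  # word equality replaces char equality
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


print(word_levenshtein("the quick brown fox jumps over the lazy dog",
                       "the pen lies over the table"))  # 6
```

Note that this yields a distance (the number of word insertions, deletions, and substitutions) rather than the count of matching words the question asks for, so the example pair gives 6 here, not 2.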

Comparing strings by matching words
