Question

I use the pg_similarity extension to check the similarity of values. Now my values are some words rather than text. I tried the smlar extension:

select smlar(a.tokenizedsentence, b.tokenizedsentence) from nlpdata a, nlpdata b;

but got the error:

ERROR:  function smlar(character varying, character varying) does not exist
LINE 1: select smlar(a.tokenizedsentence, b.tokenizedsentence) from ...
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

Then I tried:

select smlar(a.tokenizedsentence::varchar[], b.tokenizedsentence::varchar[]) from nlpdata a, nlpdata b;

and got:

ERROR:  malformed array literal: "0 0 0 1 1 0 0 0 0 1"
DETAIL:  Array value must start with "{" or dimension information.

I searched for a Postgres extension that works on vectors but could not find one. Any ideas or pointers to such an extension?

Edit: It computes now, but the result is always 0.707 or 1, even when that is wrong.

Answer 1

You can cast a string directly to an array type only if it conforms to the array literal format (e.g. {a,b,c}).

In general, though, you construct arrays with the various array functions. In your case you probably want something like:

smlar(string_to_array(a.tokenizedsentence, ' '), string_to_array(b.tokenizedsentence, ' '))
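Regarding the edit (always 0.707 or 1): if I read the smlar README correctly, its default measure is a set-style cosine, i.e. the number of common unique elements divided by sqrt(unique elements in A * unique elements in B). With only the tokens 0 and 1, each array reduces to {0}, {1} or {0,1}, so the few possible results include 1 and 1/sqrt(2) = 0.707. The sketch below is plain PostgreSQL (9.4 or later, no extension) and only illustrates that arithmetic next to a position-wise cosine on the numeric values. The sample rows and the id column are made up for illustration, and the set_cosine expression is my reconstruction of the formula, not smlar's actual code.

```sql
-- Hypothetical sample rows standing in for the real nlpdata table.
WITH nlpdata(id, tokenizedsentence) AS (
    VALUES (1, '0 0 0 1 1 0 0 0 0 1'),
           (2, '0 1 0 1 0 0 0 0 0 1'),
           (3, '0 0 0 0 0 0 0 0 0 0')
),
vec AS (
    -- Split each space-separated string into a text[] once.
    SELECT id, string_to_array(tokenizedsentence, ' ') AS tokens
    FROM nlpdata
)
SELECT a.id AS a_id,
       b.id AS b_id,
       -- Set-style cosine: common distinct tokens / sqrt(distinct in A * distinct in B).
       (SELECT count(*)
          FROM (SELECT unnest(a.tokens) INTERSECT SELECT unnest(b.tokens)) AS common)
       / sqrt((SELECT count(DISTINCT t) FROM unnest(a.tokens) AS t)
            * (SELECT count(DISTINCT t) FROM unnest(b.tokens) AS t)) AS set_cosine,
       -- Position-wise cosine on the numeric values; NULL when a vector is all zeros.
       (SELECT sum(x.val * y.val)
               / NULLIF(sqrt(sum(x.val * x.val)) * sqrt(sum(y.val * y.val)), 0)
          FROM unnest(a.tokens::float8[]) WITH ORDINALITY AS x(val, i)
          JOIN unnest(b.tokens::float8[]) WITH ORDINALITY AS y(val, i) USING (i)) AS vector_cosine
FROM vec AS a
JOIN vec AS b ON a.id < b.id
ORDER BY a_id, b_id;
```

For rows 1 and 2 the set-style value is 1 because both contain the tokens 0 and 1, while the position-wise cosine is about 0.667. Against the all-zero row 3 the set-style value is 0.707. If the position-wise number is what you are after, a set-based similarity function is probably the wrong tool for this data.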

Similarity check for vectors in Postgres
