Question

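Is it possible to assign weights to, or boost, individual query terms in PostgreSQL full-text search?

For example, when searching with the keywords "vehicle" and "bus", rows containing "vehicle" should rank higher and rows containing "bus" should rank lower.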

Answer 1

You have to use the setweight function on the tsvector to assign a specific weight to certain elements of the vector:

setweight(to_tsvector('english', '...'), 'A', '{vehicl}')

The ts_rank function will then count occurrences of vehicle as high-weight matches.
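
As a minimal sketch of how this could look in a complete query: the temporary table doc and its two rows are hypothetical and exist only for illustration. The three-argument form of setweight (with a list of lexemes) requires PostgreSQL 9.6 or later, and the lexeme has to be given in its stemmed form ('vehicl' under the english configuration).

```sql
-- Hypothetical sample data, for illustration only.
CREATE TEMP TABLE doc (body text);
INSERT INTO doc VALUES
    ('a bus stops here'),
    ('a vehicle stops here');

SELECT body,
       ts_rank(
           -- Raise only the 'vehicl' lexeme to weight A; everything else keeps the default weight D.
           setweight(to_tsvector('english', body), 'A', '{vehicl}'),
           to_tsquery('english', 'vehicle | bus')
       ) AS rank
FROM doc
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'vehicle | bus')
ORDER BY rank DESC;
```

With the default ts_rank weights, weight A counts for more than weight D, so the row containing "vehicle" should sort ahead of the row containing "bus".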

Answer 2

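Another possibility is to combine the ranks of several queries into one. This covers relaxing and tightening techniques beyond per-term boosting; note the use of the simple dictionary in my example. It also works on hosted (SaaS) versions of PostgreSQL, where you cannot freely install custom extensions or dictionaries.

The built-in ranking functions are only examples. You can write your own ranking function and/or combine its result with other factors to fit your specific needs.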
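
One way such a custom ranking function might look is sketched below. The name match_rank, its parameters, and the boost factors are all hypothetical; the function simply wraps the built-in ts_rank, returns zero when the query does not actually match, and multiplies the rank by a caller-supplied boost.

```sql
-- Hypothetical helper: rank only when the query really matches, then apply a boost factor.
CREATE FUNCTION match_rank(vec tsvector, q tsquery, boost real)
RETURNS real
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT CASE WHEN vec @@ q THEN ts_rank(vec, q) * boost
                ELSE 0::real
           END
$$;

-- Usage on inline sample rows: take the best score across several query variants.
SELECT t.name,
       greatest(
           match_rank(to_tsvector('english', t.name), to_tsquery('english', 'cats & dogs'), 1.0),
           match_rank(to_tsvector('simple',  t.name), to_tsquery('simple',  'cats & dogs'), 1.05),
           match_rank(to_tsvector('english', t.name), to_tsquery('english', 'cats | dogs'), 0.95)
       ) AS combined_rank
FROM (VALUES ('cats and dogs'), ('cat'), ('bird')) AS t(name)
ORDER BY combined_rank DESC;
```

Keeping the match test and the boost inside one function keeps each variant on a single line, at the cost of hiding the expressions from a reader who wants to compare them with an index definition.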

PostgreSQL docs - Controlling Text Search

CREATE TABLE animal
(
    name text
);

insert into animal
values ('cat-dog'),
       ('cat dog'),
       ('dog-cat'),
       ('dog cat'),
       ('cats and dogs'),
       ('dogs and cats'),
       ('cat'),
       ('dog');

create index animal_fulltext_idx on animal using gist (to_tsvector('english', coalesce(name, '')));

SELECT name,
       to_tsvector('english', coalesce(name, '')) @@ to_tsquery('english', 'cats & dogs') as original_query_matches,
       ts_rank(to_tsvector('english', coalesce(name, '')), to_tsquery('english', 'cats & dogs')) as original_query_rank,
       to_tsvector('simple', coalesce(name, '')) @@ to_tsquery('simple', 'cats & dogs') as strict_query_matches,
       ts_rank(to_tsvector('simple', coalesce(name, '')), to_tsquery('simple', 'cats & dogs')) * 1.05 as strict_query_rank,
       to_tsvector('simple', coalesce(name, '')) @@ to_tsquery('simple', 'cats <2> dogs') as extra_strict_query_matches,
       ts_rank(to_tsvector('simple', coalesce(name, '')), to_tsquery('simple', 'cats <2> dogs')) * 1.15 as extra_strict_query_rank,
       to_tsvector('english', coalesce(name, '')) @@ to_tsquery('english', 'cats | dogs') as relaxed_query_matches,
       ts_rank(to_tsvector('english', coalesce(name, '')), to_tsquery('english', 'cats | dogs')) * 0.95 as relaxed_query_rank,
       greatest(
               CASE
                   WHEN to_tsvector('english', coalesce(name, '')) @@ to_tsquery('english', 'cats & dogs')
                       THEN ts_rank(to_tsvector('english', coalesce(name, '')), to_tsquery('english', 'cats & dogs'))
                   ELSE 0 END,
               CASE
                   WHEN to_tsvector('simple', coalesce(name, '')) @@ to_tsquery('simple', 'cats & dogs')
                       THEN ts_rank(to_tsvector('simple', coalesce(name, '')), to_tsquery('simple', 'cats & dogs')) * 1.05
                   ELSE 0 END,
               CASE
                   WHEN to_tsvector('simple', coalesce(name, '')) @@ to_tsquery('simple', 'cats <2> dogs')
                       THEN ts_rank(to_tsvector('simple', coalesce(name, '')), to_tsquery('simple', 'cats <2> dogs')) * 1.15
                   ELSE 0 END,
               CASE
                   WHEN to_tsvector('english', coalesce(name, '')) @@ to_tsquery('english', 'cats | dogs')
                       THEN ts_rank(to_tsvector('english', coalesce(name, '')), to_tsquery('english', 'cats | dogs')) * 0.95
                   ELSE 0 END
           ) as greatest_rank
FROM animal
where to_tsvector('english', coalesce(name, '')) @@ (to_tsquery('english', 'cats & dogs') || to_tsquery('english', 'cats <-> dogs') || to_tsquery('english', 'cats | dogs'))
order by greatest_rank desc;

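ts_rank can return a non-zero number even when the query does not "match", which is why the CASE expressions are there.
The "matches" condition in the WHERE clause should return a superset of the results of all the "matches" queries being ranked.
Beware of the coalesce function: if you use it in the query but not in the index, the index will not be used.
to_tsvector must be given an explicit configuration to be immutable, which an index expression requires.
The configuration of the indexed tsvector and of the tsvector used with @@ must be the same for the index to be used.
If you use setweight with the @@ operator but not in the index, the index will not be used.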
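
The index-related notes above can be checked with EXPLAIN. The sketch below assumes the animal table and animal_fulltext_idx from the example; enable_seqscan is switched off only for the demonstration, because on a table this small the planner would otherwise likely prefer a sequential scan regardless.

```sql
-- Demo only: discourage sequential scans so the planner's index choice is visible on a tiny table.
SET enable_seqscan = off;

-- Same expression as in the index definition: eligible for animal_fulltext_idx.
EXPLAIN
SELECT name FROM animal
WHERE to_tsvector('english', coalesce(name, '')) @@ to_tsquery('english', 'cats & dogs');

-- coalesce dropped: the expression no longer matches the indexed expression.
EXPLAIN
SELECT name FROM animal
WHERE to_tsvector('english', name) @@ to_tsquery('english', 'cats & dogs');

-- Different configuration: this does not match the indexed expression either.
EXPLAIN
SELECT name FROM animal
WHERE to_tsvector('simple', coalesce(name, '')) @@ to_tsquery('simple', 'cats & dogs');

RESET enable_seqscan;
```

Only the first plan should reference the index; the other two are expected to fall back to a sequential scan.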

Tested on PostgreSQL 12.2.

(If anyone knows how to achieve the same result with a simpler or faster query, I would love to hear about it.)
