Problem with full-text search in Postgres

Time: 2018-03-04 02:19:22

Tags: postgresql full-text-search

I have the following table and data:

/* script for people table, with field tsvector and gin */

CREATE TABLE public.people (
  id INTEGER,
  name VARCHAR(30),
  lastname VARCHAR(30),
  complete TSVECTOR
) 
WITH (oids = false);

CREATE INDEX idx_complete ON public.people
  USING gin (complete);

/* data for people table */

INSERT INTO public.people ("id", "name", "lastname", "complete")
VALUES 
  (1, 'MICHAEL', 'BRYANT BRYANT', '''bryant'':2,3 ''michael'':1'),
  (2, 'HENRY STEVEN', 'BUSH TIESSEN', '''bush'':3 ''henri'':1 ''steven'':2 ''tiessen'':4'),
  (3, 'WILLINGTON STEVEN', 'STEPHENS FLINN', '''flinn'':4 ''stephen'':3 ''steven'':2 ''willington'':1'),
  (4, 'BRET', 'MARTINEZ AROCH', '''aroch'':3 ''bret'':1 ''martinez'':2'),
  (5, 'TERENCE BERT', 'CAVALIERE ENRON', '''bert'':2 ''cavalier'':3 ''terenc'':1');

I need to search by first name and last name using the tsvector column. Currently I have this query:

SELECT * FROM people WHERE complete @@ to_tsquery('WILLINGTON & FLINN');

The result is correct (the third record). But if I try

SELECT * FROM people WHERE complete @@ to_tsquery('STEVEN & FLINN');
/* the same record! */

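I get no results. Why? What should I do?

2 answers:

Answer 0 (score: 0)

You should search the table with the same language (text search configuration) that was used when the values in the complete column were inserted.

Check the result of this query, which compares English and German: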

SELECT *,
       to_tsvector('english', concat_ws(' ', name, lastname)) AS english,
       to_tsvector('german', concat_ws(' ', name, lastname)) AS german
FROM public.people;

So this should work for you:

SELECT * FROM people WHERE complete @@ to_tsquery('english','STEVEN & FLINN');
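
To see what the query side looks like under each configuration, the same search string can be passed through to_tsquery twice. This is only a sketch for inspection; the column aliases english_query and german_query are made up for illustration and are not part of the original post.

```sql
-- Compare the lexemes that each configuration produces for the search terms.
-- The tsvector stored in "complete" for id = 3 contains 'steven' and 'flinn',
-- so the query only matches when it produces exactly those lexemes.
SELECT to_tsquery('english', 'STEVEN & FLINN') AS english_query,
       to_tsquery('german',  'STEVEN & FLINN') AS german_query;
```

If the two columns differ, a query that relies on default_text_search_config will behave differently depending on which configuration is active in the session.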

Answer 1 (score: 0)

You are probably using a text search configuration in which STEVEN or FLINN is modified by stemming.

I can reproduce that here:

test=> SHOW default_text_search_config;
 default_text_search_config 
----------------------------
 pg_catalog.german
(1 row)

test=> SELECT complete FROM public.people WHERE id = 3;
                    complete                     
-------------------------------------------------
 'flinn':4 'stephen':3 'steven':2 'willington':1
(1 row)

test=> SELECT * FROM ts_debug('STEVEN & FLINN');
   alias   |   description   | token  | dictionaries  | dictionary  | lexemes 
-----------+-----------------+--------+---------------+-------------+---------
 asciiword | Word, all ASCII | STEVEN | {german_stem} | german_stem | {stev}
 blank     | Space symbols   |        | {}            |             | 
 blank     | Space symbols   | &      | {}            |             | 
 asciiword | Word, all ASCII | FLINN  | {german_stem} | german_stem | {flinn}
(4 rows)

test=> SELECT * FROM public.people
       WHERE complete @@ to_tsquery('STEVEN & FLINN');
 id | name | lastname | complete 
----+------+----------+----------
(0 rows)

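So you see, the German Snowball dictionary stems STEVEN to stev.

Since complete contains the unstemmed form steven, no match is found.

You should use the same text search configuration when populating complete and when querying.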
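
A minimal sketch of keeping both sides consistent follows. It assumes 'english' is the intended configuration, since the stored lexemes (henri, terenc, cavalier) look like English stems; the original post does not show how complete was populated, so treat that choice as an assumption.

```sql
-- Assumption: 'english' is the configuration the data should be indexed with.
-- Rebuild the tsvector column from the name columns with an explicit configuration.
UPDATE public.people
SET complete = to_tsvector('english', concat_ws(' ', name, lastname));

-- Name the same configuration in the query, so the result does not depend
-- on the session's default_text_search_config.
SELECT id, name, lastname
FROM public.people
WHERE complete @@ to_tsquery('english', 'STEVEN & FLINN');
```

Alternatively, running SET default_text_search_config = 'pg_catalog.english'; makes the one-argument forms of to_tsvector and to_tsquery use that configuration, but passing it explicitly is harder to get wrong when sessions or servers differ.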