Question

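I need to get the count of each word across all descriptions for a given title. I tried using ts_stat(), but it gives me word frequencies for the whole column, regardless of which title a word belongs to:

  select ts_stat($$select to_tsvector('simple', posts.description) from posts$$);

I'm looking for help creating a new table in which each row contains title, word, count.
Initially I considered one row per title, with (title, comma-separated words and their counts) as columns, but getting the count of a word for a given title would then take some extra work, so I thought of adding a new row for every word of every title instead.

If there is a better way, please let me know.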

version: PostgreSQL 9.5.8

Answer 1

I couldn't come up with anything less monstrous than this:

t=# with c as (
  select to_tsvector('simple',unnest(string_to_array(description,' '))),title
  from posts
)
, d as (
  select translate(split_part(to_tsvector::text,':',1),$$'$$,'') ts,title
  from c
  where octet_length(to_tsvector::text) > 0
)
select ts,title,count(1)
from d
group by title,ts
order by 1;
     ts      |   title    | count
-------------+------------+-------
 a           |  title1    |     1
 about       |  title1    |     1
 about       |  title2    |     1
 description |  title1    |     2
 description |  title2    |     1
 different   |  title1    |     1
 from        |  title1    |     1
 is          |  title2    |     2
 is          |  title1    |     1
 other       |  title2    |     1
 previous    |  title1    |     1
 short       |  title1    |     1
 some        |  title2    |     1
 this        |  title1    |     1
 this        |  title2    |     2
 title1      |  title1    |     1
 title2      |  title2    |     1
(17 rows)

To reconcile with ts_stat over the whole column (the per-title counts above add up to these totals):

t=# select ts_stat('select to_tsvector($$simple$$,description) from posts') order by 1 ;
      ts_stat
-------------------
 (a,1,1)
 (about,2,2)
 (description,3,3)
 (different,1,1)
 (from,1,1)
 (is,3,3)
 (other,1,1)
 (previous,1,1)
 (short,1,1)
 (some,1,1)
 (this,3,3)
 (title1,1,1)
 (title2,1,1)
(13 rows)

But then again, my experience with FTS is very limited; you can probably do better with the ts_* functions.
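For what it's worth, one way to lean on ts_stat itself is to call it once per row through a LATERAL join, building a one-row query from each description. This is only a sketch: it assumes the posts(title, description) table from the question, title_word_counts is a hypothetical table name, and LATERAL needs PostgreSQL 9.3 or later (so it should be available on 9.5.8). Because the text search parser does the tokenizing here, the results may differ slightly from splitting on spaces.

```sql
-- Sketch only: assumes posts(title, description) as in the question.
-- title_word_counts is a hypothetical table name.
CREATE TABLE title_word_counts AS
SELECT p.title,
       s.word,
       sum(s.nentry) AS count   -- summed in case several posts share a title
FROM posts AS p
CROSS JOIN LATERAL ts_stat(
         -- one single-row query per post; %L quotes the text as a literal
         format('SELECT to_tsvector(''simple'', %L::text)', p.description)
     ) AS s
GROUP BY p.title, s.word;

-- Word counts for a single title:
SELECT word, count
FROM title_word_counts
WHERE title = 'title1'
ORDER BY word;
```

This gives the one-row-per-(title, word) layout asked for in the question, so looking up the counts for a title is a plain WHERE filter.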
