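I need help writing an advanced Postgres query. I am trying to find sentences in which two words appear directly next to each other, using plain Postgres rather than a procedural-language extension. My tables are: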
TABLE word (spelling text, wordid serial)
TABLE sentence (sentenceid serial)
TABLE item (sentenceid integer, position smallint, wordid integer)
I have a simple query that finds sentences containing a single word:
SELECT DISTINCT sentence.sentenceid
FROM item,word,sentence
WHERE word.spelling = 'word1'
AND item.wordid = word.wordid
AND sentence.sentenceid = item.sentenceid
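I want to filter the results of that query by a second word (word2): its item must have an item.sentenceid equal to the sentenceid of the current query result (item or sentence), and an item.position equal to the current result's item.position + 1. How can I refine my query to achieve this efficiently?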
Answer 0 (score: 1)
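I think this does what you want. Sorry, but right now I can't remember how to write it without JOIN clauses. Basically, I add a self-join on the item and word tables to get, for each item, the next item in the same sentence. If the query planner doesn't like my nested select, you could also try joining the word table directly.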
SELECT DISTINCT sentence.sentenceid
FROM item
INNER JOIN word
        ON item.wordid = word.wordid
INNER JOIN sentence
        ON sentence.sentenceid = item.sentenceid
-- The WHERE filter on subsequent.spelling makes this LEFT JOIN behave like an inner join.
LEFT JOIN (SELECT subsequent_item.sentenceid,
                  subsequent_item.position,
                  subsequent_word.spelling
           FROM item AS subsequent_item
           INNER JOIN word AS subsequent_word
                   ON subsequent_item.wordid = subsequent_word.wordid) subsequent
       ON subsequent.sentenceid = item.sentenceid
      AND subsequent.position = item.position + 1
WHERE word.spelling = 'word1' AND subsequent.spelling = 'word2';
Answer 1 (score: 1)
A simpler solution, but it only gives results when there are no gaps in the item.position values:
SELECT DISTINCT sentence.sentenceid
FROM sentence
JOIN item ON sentence.sentenceid = item.sentenceid
JOIN word ON item.wordid = word.wordid
JOIN item AS next_item ON sentence.sentenceid = next_item.sentenceid
AND next_item.position = item.position + 1
JOIN word AS next_word ON next_item.wordid = next_word.wordid
WHERE word.spelling = 'word1'
AND next_word.spelling = 'word2'
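Since the question asks about efficiency: the self-join above looks up item rows by (sentenceid, position) and by wordid, and word rows by spelling, so indexes along the following lines may help. This is only a sketch. The index names are hypothetical, the question does not say which indexes or keys already exist, and whether the planner actually uses them depends on the data, so it is worth checking with EXPLAIN ANALYZE.

```sql
-- Hypothetical index names; skip any that duplicate an existing key or index.

-- Supports finding the neighbouring item (same sentence, position + 1).
CREATE INDEX item_sentence_position_idx ON item (sentenceid, position);

-- Supports finding all items for a given word.
CREATE INDEX item_wordid_idx ON item (wordid);

-- Supports the lookup of 'word1' / 'word2' by spelling.
CREATE INDEX word_spelling_idx ON word (spelling);
```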
A more general solution, using window functions:
SELECT DISTINCT sentenceid
FROM (SELECT sentence.sentenceid,
             word.spelling,
             -- spelling of the next word in the same sentence, by position
             lead(word.spelling) OVER (PARTITION BY sentence.sentenceid
                                       ORDER BY item.position) AS next_spelling
      FROM sentence
      JOIN item ON sentence.sentenceid = item.sentenceid
      JOIN word ON item.wordid = word.wordid) AS pairs
WHERE spelling = 'word1'
AND next_spelling = 'word2'
Edit: also a general solution (gaps allowed), but using joins only:
SELECT DISTINCT sentence.sentenceid
FROM sentence
JOIN item ON sentence.sentenceid = item.sentenceid
JOIN word ON item.wordid = word.wordid
JOIN item AS next_item ON sentence.sentenceid = next_item.sentenceid
AND next_item.position > item.position
JOIN word AS next_word ON next_item.wordid = next_word.wordid
LEFT JOIN item AS mediate_word ON sentence.sentenceid = mediate_word.sentenceid
AND mediate_word.position > item.position
AND mediate_word.position < next_item.position
WHERE mediate_word.wordid IS NULL
AND word.spelling = 'word1'
AND next_word.spelling = 'word2'
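To see how the three queries in this answer differ, here is a small sketch of test data. It assumes the table layout from the question; the TEMP tables, the primary keys and the sample rows are made up for illustration.

```sql
-- Throwaway tables mirroring the question's schema (TEMP so nothing real is touched).
CREATE TEMP TABLE word (wordid serial PRIMARY KEY, spelling text);
CREATE TEMP TABLE sentence (sentenceid serial PRIMARY KEY);
CREATE TEMP TABLE item (sentenceid integer, position smallint, wordid integer);

INSERT INTO word (wordid, spelling) VALUES
    (1, 'word1'), (2, 'word2'), (3, 'other');
INSERT INTO sentence (sentenceid) VALUES (1), (2), (3);

INSERT INTO item (sentenceid, position, wordid) VALUES
    (1,  1, 1), (1,  2, 2),             -- sentence 1: word1 word2, positions 1, 2
    (2, 10, 1), (2, 20, 2),             -- sentence 2: word1 word2, positions 10, 20 (gap)
    (3,  1, 1), (3,  2, 3), (3, 3, 2);  -- sentence 3: word1 other word2
```

With this data, the position + 1 join should return only sentence 1, while the window-function query and the joins-only query with the mediate_word check should return sentences 1 and 2. Sentence 3 is excluded by all three because another word sits between word1 and word2.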
Answer 2 (score: 1)
select
    *
from mytable
where
    round( 0.1 / ts_rank_cd( to_tsvector(mycolumn), to_tsquery('word1 & word2') ) ) <= 1
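This will actually work, assuming you are not using A-D weight labels; otherwise you will need to change 0.1 to something else.
You will also want to add a tsvector @@ tsquery condition to the WHERE clause.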
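Putting the two remarks above together, a minimal sketch might look like this. It reuses mytable and mycolumn from the query above. The NULLIF guard is an addition that is not part of the original answer: Postgres does not promise to evaluate WHERE conditions in the order they are written, so the guard avoids a division by zero if ts_rank_cd returns 0 for a row.

```sql
SELECT *
FROM mytable
WHERE
    -- Only rows that contain both words at all.
    to_tsvector(mycolumn) @@ to_tsquery('word1 & word2')
    -- Rank-based adjacency test from the answer; NULLIF turns a zero rank into NULL,
    -- which makes the comparison NULL and filters the row out instead of raising an error.
    AND round( 0.1 / NULLIF( ts_rank_cd( to_tsvector(mycolumn),
                                         to_tsquery('word1 & word2') ), 0 ) ) <= 1;
```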