I have three tables as follows:
documents (id, content)
words (id, word)
word_document (word_id, document_id, count)
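The words table contains every word that appears in any document, and word_document links a word to a document together with the number of times that word occurs in that document.
I want to write a query that searches for two words and returns only the documents that contain both, ordered by the combined count of the two words in each document.
For example: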
DocA: green apple is not blue
DocB: blue apple is blue
DocC: red apple is red
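Searching for apple and blue should now return:
DocB, 3
DocA, 2
because...
DocA contains both words, 2 occurrences in total (apple once, blue once)
DocB contains both words, 3 occurrences in total (apple once, blue twice)
DocC only contains one of the words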
I got the filtering to work with INTERSECT, but it returns neither the summed count nor any ordering.
Answer 0 (score: 0)
I think this should do it:
-- word_id is a numeric key, so look the words up through the words table
select a.document_id, a.count + b.count as total
from
(
select wd.document_id, wd.count
from word_document wd
join words w on w.id = wd.word_id
where w.word = 'apple'
) a
INNER JOIN
(
select wd.document_id, wd.count
from word_document wd
join words w on w.id = wd.word_id
where w.word = 'blue'
) b
ON a.document_id = b.document_id
-- highest combined count first
ORDER BY total DESC
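To try the self-join approach from this answer end to end, here is a minimal sketch that assumes SQLite through Python's built-in sqlite3 module. The helper names build_db and search_both are made up for illustration, and the rows are derived by counting the words in the three example documents from the question, with ids 1, 2, 3 standing in for DocA, DocB, DocC.

```python
import sqlite3
from collections import Counter

# Example documents from the question; ids 1, 2, 3 stand in for DocA, DocB, DocC.
DOCS = {
    1: "green apple is not blue",
    2: "blue apple is blue",
    3: "red apple is red",
}

# Each subquery selects the rows for one search word; the inner join keeps
# only documents that appear in both subqueries.
QUERY = """
SELECT a.document_id, a.count + b.count AS total
FROM (
    SELECT wd.document_id, wd.count
    FROM word_document wd
    JOIN words w ON w.id = wd.word_id
    WHERE w.word = ?
) a
JOIN (
    SELECT wd.document_id, wd.count
    FROM word_document wd
    JOIN words w ON w.id = wd.word_id
    WHERE w.word = ?
) b ON a.document_id = b.document_id
ORDER BY total DESC
"""


def build_db():
    """Create an in-memory database with the three tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, content TEXT);
        CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
        CREATE TABLE word_document (word_id INTEGER, document_id INTEGER, count INTEGER);
    """)
    for doc_id, content in DOCS.items():
        conn.execute("INSERT INTO documents VALUES (?, ?)", (doc_id, content))
        for word, n in Counter(content.split()).items():
            conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
            word_id = conn.execute(
                "SELECT id FROM words WHERE word = ?", (word,)
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO word_document VALUES (?, ?, ?)", (word_id, doc_id, n)
            )
    conn.commit()
    return conn


def search_both(conn, first, second):
    """Return (document_id, total) for documents containing both words."""
    return conn.execute(QUERY, (first, second)).fetchall()
```

Calling search_both(build_db(), "apple", "blue") on this data gives [(2, 3), (1, 2)]: DocB has blue twice and apple once, DocA has each word once, and DocC is dropped because it has no blue.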
Answer 1 (score: 0)
For anyone who needs this, this is the only version that worked for me:
select wd.document_id, (wd.count + d.count) as tcount
from word_document as wd
join words as w on w.id = wd.word_id
join (
    -- documents that contain 'apple', with that word's count
    select word_document.document_id, word_document.count
    from word_document
    join words on words.id = word_document.word_id
    where words.word = 'apple'
) d on d.document_id = wd.document_id
-- single quotes: standard SQL reads double quotes as identifiers
where w.word = 'blue'
order by tcount desc
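You can create a temporary table from the inner query and run the outer query against it. Repeating that step lets you extend the search to more words.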
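The temporary-table idea can be sketched roughly as follows, again assuming SQLite through Python's sqlite3 module. The names build_db, search_all_words and the step_N temp tables are hypothetical, and the sketch assumes word_document holds at most one row per (word, document) pair and that the search words are distinct.

```python
import sqlite3
from collections import Counter

# Example documents from the question; ids 1, 2, 3 stand in for DocA, DocB, DocC.
DOCS = {
    1: "green apple is not blue",
    2: "blue apple is blue",
    3: "red apple is red",
}


def build_db():
    """Create an in-memory database with the three tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, content TEXT);
        CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
        CREATE TABLE word_document (word_id INTEGER, document_id INTEGER, count INTEGER);
    """)
    for doc_id, content in DOCS.items():
        conn.execute("INSERT INTO documents VALUES (?, ?)", (doc_id, content))
        for word, n in Counter(content.split()).items():
            conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
            word_id = conn.execute(
                "SELECT id FROM words WHERE word = ?", (word,)
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO word_document VALUES (?, ?, ?)", (word_id, doc_id, n)
            )
    conn.commit()
    return conn


def search_all_words(conn, search_words):
    """Return (document_id, total) for documents containing every search word."""
    if not search_words:
        return []
    cur = conn.cursor()
    # Step 0: documents containing the first word, with that word's count.
    cur.execute("CREATE TEMP TABLE step_0 (document_id INTEGER, total INTEGER)")
    cur.execute(
        "INSERT INTO step_0 "
        "SELECT wd.document_id, wd.count FROM word_document wd "
        "JOIN words w ON w.id = wd.word_id WHERE w.word = ?",
        (search_words[0],),
    )
    # Each later step joins the previous temp table with the next word's rows,
    # which drops documents missing that word and adds its count to the total.
    for i, word in enumerate(search_words[1:], start=1):
        cur.execute(f"CREATE TEMP TABLE step_{i} (document_id INTEGER, total INTEGER)")
        cur.execute(
            f"INSERT INTO step_{i} "
            f"SELECT s.document_id, s.total + wd.count FROM step_{i - 1} s "
            "JOIN word_document wd ON wd.document_id = s.document_id "
            "JOIN words w ON w.id = wd.word_id WHERE w.word = ?",
            (word,),
        )
    last = len(search_words) - 1
    rows = cur.execute(
        f"SELECT document_id, total FROM step_{last} ORDER BY total DESC"
    ).fetchall()
    # Clean up so the function can be called again on the same connection.
    for i in range(len(search_words)):
        cur.execute(f"DROP TABLE step_{i}")
    return rows
```

With the example data, search_all_words(build_db(), ["apple", "blue", "is"]) returns [(2, 4), (1, 3)]. For long word lists, a single GROUP BY query with a HAVING clause that checks the number of matched words may be simpler than chaining temp tables; that is a different technique from the one this answer describes.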