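I have a crawler that scans every word on a web page. It inserts each word into a MySQL database together with the URL of the page it came from. Search results are then ranked by how many times the word occurs in each document. The question is: how can I extend my existing query to handle multi-term searches?
It works well for single-term queries, but I would like the query to first try to find all the words on the same page and, if no page contains all of them, fall back to the usual results for the individual terms.
My query is as follows: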
$results = addslashes( $_POST['results'] );
" SELECT p.page_url AS url,
         COUNT(*) AS occurrences
  FROM page p, word w, occurrence o
  WHERE p.page_id = o.page_id AND
        w.word_id = o.word_id AND
        w.word_word = \"$keyword\"
  GROUP BY p.page_id
  ORDER BY occurrences DESC
  LIMIT $results"
Answer 0 (score: 0)
Use COUNT(DISTINCT ...) to count how many different search words were found on each page, and use IN to match any word from a list:
SELECT
    p.page_url AS url,
    COUNT(DISTINCT w.word_word) AS words_found,
    COUNT(*) AS occurrences
FROM page p
JOIN occurrence o ON p.page_id = o.page_id
JOIN word w ON w.word_id = o.word_id
WHERE w.word_word IN ('foo', 'bar')
GROUP BY p.page_id
ORDER BY occurrences DESC
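To see how words_found and occurrences differ, here is a minimal sketch that runs the same query against a small in-memory database. It assumes Python's sqlite3 module as a stand-in for MySQL; the column types, the sample URLs and texts, and the helper names build_index and search are made up for illustration. The IN list is built from bound placeholders rather than by escaping and concatenating user input.

```python
import sqlite3


def build_index(conn, pages):
    """Create page/word/occurrence tables (assumed schema) and index the pages."""
    conn.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_url TEXT UNIQUE);
        CREATE TABLE word (word_id INTEGER PRIMARY KEY, word_word TEXT UNIQUE);
        CREATE TABLE occurrence (page_id INTEGER, word_id INTEGER);
    """)
    for url, text in pages.items():
        page_id = conn.execute(
            "INSERT INTO page (page_url) VALUES (?)", (url,)
        ).lastrowid
        for token in text.split():
            # Store each distinct word once, then record one row per occurrence.
            conn.execute(
                "INSERT OR IGNORE INTO word (word_word) VALUES (?)", (token,)
            )
            word_id = conn.execute(
                "SELECT word_id FROM word WHERE word_word = ?", (token,)
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO occurrence VALUES (?, ?)", (page_id, word_id)
            )


def search(conn, terms, limit=10):
    """Return (url, words_found, occurrences) for pages matching any term."""
    # One "?" per search term, so the values are bound, not concatenated.
    placeholders = ", ".join("?" for _ in terms)
    sql = f"""
        SELECT p.page_url AS url,
               COUNT(DISTINCT w.word_word) AS words_found,
               COUNT(*) AS occurrences
        FROM page p
        JOIN occurrence o ON p.page_id = o.page_id
        JOIN word w ON w.word_id = o.word_id
        WHERE w.word_word IN ({placeholders})
        GROUP BY p.page_id
        ORDER BY occurrences DESC
        LIMIT ?
    """
    return conn.execute(sql, (*terms, limit)).fetchall()


conn = sqlite3.connect(":memory:")
build_index(conn, {
    "http://example.com/a": "foo bar foo baz",
    "http://example.com/b": "foo qux",
    "http://example.com/c": "baz qux",
})
rows = search(conn, ["foo", "bar"])
for row in rows:
    print(row)
```

Page a contains both terms (three matching occurrences in total), page b contains only foo, and page c matches nothing and is therefore absent from the result.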
If you want to make sure that at least n of the search terms appear on the page, add a HAVING clause:
GROUP BY p.page_id
HAVING COUNT(DISTINCT w.word_word) >= 2
ORDER BY occurrences DESC
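The HAVING filter can also be combined with the fallback described in the question: first require every term on the page, and only if nothing qualifies, accept pages with any of the terms. The following is a rough sketch of that idea, again assuming sqlite3 in place of MySQL; the sample rows and the function names search_min_terms and search_with_fallback are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_url TEXT);
    CREATE TABLE word (word_id INTEGER PRIMARY KEY, word_word TEXT);
    CREATE TABLE occurrence (page_id INTEGER, word_id INTEGER);
    INSERT INTO page VALUES (1, 'http://example.com/a'),
                            (2, 'http://example.com/b');
    INSERT INTO word VALUES (1, 'foo'), (2, 'bar');
    -- page a: foo, foo, bar; page b: foo four times
    INSERT INTO occurrence VALUES (1, 1), (1, 1), (1, 2),
                                  (2, 1), (2, 1), (2, 1), (2, 1);
""")


def search_min_terms(conn, terms, min_terms):
    """Pages containing at least min_terms distinct search terms."""
    placeholders = ", ".join("?" for _ in terms)
    sql = f"""
        SELECT p.page_url AS url,
               COUNT(DISTINCT w.word_word) AS words_found,
               COUNT(*) AS occurrences
        FROM page p
        JOIN occurrence o ON p.page_id = o.page_id
        JOIN word w ON w.word_id = o.word_id
        WHERE w.word_word IN ({placeholders})
        GROUP BY p.page_id
        HAVING COUNT(DISTINCT w.word_word) >= ?
        ORDER BY occurrences DESC
    """
    return conn.execute(sql, (*terms, min_terms)).fetchall()


def search_with_fallback(conn, terms):
    """Prefer pages holding every term; otherwise accept any single term."""
    strict = search_min_terms(conn, terms, len(terms))
    return strict or search_min_terms(conn, terms, 1)


both = search_with_fallback(conn, ["foo", "bar"])
fallback = search_with_fallback(conn, ["foo", "missing"])
print(both)
print(fallback)
```

This costs a second query whenever the strict search comes back empty. An alternative worth considering is a single query ordered by words_found first and occurrences second, which ranks pages with more matching terms higher without filtering any out.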
Answer 1 (score: 0)
If your database engine supports it, you can use subselects. For example:
SELECT
url,
(select count(*) from table where conditions1) as count1,
(select count(*) from table where conditions2) as count2
FROM table
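One possible concrete reading of that template, applied to the page/word/occurrence tables from the question, is a correlated subquery per search term. This is only a sketch under assumptions: sqlite3 stands in for MySQL, the sample data is invented, and count1 and count2 are interpreted here as the per-page counts of the first and second term.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_url TEXT);
    CREATE TABLE word (word_id INTEGER PRIMARY KEY, word_word TEXT);
    CREATE TABLE occurrence (page_id INTEGER, word_id INTEGER);
    INSERT INTO page VALUES (1, 'http://example.com/a'),
                            (2, 'http://example.com/b');
    INSERT INTO word VALUES (1, 'foo'), (2, 'bar');
    -- page a: foo, foo, bar; page b: foo four times
    INSERT INTO occurrence VALUES (1, 1), (1, 1), (1, 2),
                                  (2, 1), (2, 1), (2, 1), (2, 1);
""")

# Each subselect counts one term for the page of the current outer row.
sql = """
    SELECT p.page_url AS url,
           (SELECT COUNT(*)
              FROM occurrence o
              JOIN word w ON w.word_id = o.word_id
             WHERE o.page_id = p.page_id AND w.word_word = ?) AS count1,
           (SELECT COUNT(*)
              FROM occurrence o
              JOIN word w ON w.word_id = o.word_id
             WHERE o.page_id = p.page_id AND w.word_word = ?) AS count2
    FROM page p
    ORDER BY p.page_id
"""
per_term = conn.execute(sql, ("foo", "bar")).fetchall()
for row in per_term:
    print(row)
```

Unlike the IN approach, this returns a separate count for each term, but the query text grows with the number of terms and every page is listed even when all of its counts are zero.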