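Let's say I have a table posts with these columns:
top_title, title, sub_title, text
I need full-text search across all of these columns, ordered by relevance, where top_title should count for more than title, and so on.
So I have two related questions: what is the best way to build the index for this, and how should I write the query so that it makes the best use of that index?
Index options: I can create one combined full-text index over all of these columns, or a separate index for each column. Which is the preferred way?
Option 1: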
SELECT
title,
MATCH (top_title) AGAINST ('text' IN BOOLEAN MODE) as toptitle_score,
MATCH (title) AGAINST ('text' IN BOOLEAN MODE) as title_score,
MATCH (sub_text) AGAINST ('text' IN BOOLEAN MODE) as sub_text_score
FROM
`posts`
WHERE
MATCH (top_title,title , sub_text ) AGAINST ('text' IN BOOLEAN MODE)
and `posts`.`deleted_at` IS NULL
AND `published_at` IS NOT NULL
Order by toptitle_score desc,
Order by title_score desc ,
Order by sub_text_score desc
Option 2:
SELECT
title,
MATCH (top_title) AGAINST ('text' IN BOOLEAN MODE) as toptitle_score,
MATCH (title) AGAINST ('text' IN BOOLEAN MODE) as title_score,
MATCH (sub_text) AGAINST ('text' IN BOOLEAN MODE) as sub_text_score
FROM
`posts`
WHERE
(MATCH (top_title) AGAINST ('text' IN BOOLEAN MODE)
or MATCH (title) AGAINST ('text' IN BOOLEAN MODE)
or MATCH (sub_text) AGAINST ('text' IN BOOLEAN MODE))
and `posts`.`deleted_at` IS NULL
AND `published_at` IS NOT NULL
Order by toptitle_score desc,
Order by title_score desc ,
Order by sub_text_score desc
Option 3:
Is there some smarter way?
Answer 0 (score: 1)
Option 1 is good. It needs 4 FULLTEXT indexes (one per column, plus one covering all 3 columns). Don't repeat ORDER BY:
ORDER BY toptitle_score DESC,
title_score DESC,
sub_text_score DESC
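As a rough sketch of the four indexes mentioned above: the index names (ft_top_title, ft_title, ft_sub_text, ft_all) are hypothetical, and the column name sub_text is taken from the queries rather than from the column list in the question. Each MATCH (...) column list has to correspond to a FULLTEXT index over exactly those columns, which is why the combined index is needed in addition to the three single-column ones. The statements are issued one at a time, since InnoDB may refuse to build several FULLTEXT indexes in a single ALTER TABLE.

```sql
-- One FULLTEXT index per column, used by the three per-column score expressions.
CREATE FULLTEXT INDEX ft_top_title ON posts (top_title);
CREATE FULLTEXT INDEX ft_title     ON posts (title);
CREATE FULLTEXT INDEX ft_sub_text  ON posts (sub_text);

-- One combined index, used by MATCH (top_title, title, sub_text) in the WHERE clause.
CREATE FULLTEXT INDEX ft_all ON posts (top_title, title, sub_text);
```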
Option 2 is not a viable contender. It needs only 3 indexes (not much of a saving), but it is a lot slower because of the OR.
Option 3 ... (Option 1, as fixed, plus ...)
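The ORDER BY you are using is probably "wrong" for what you want. For example, it will push any row that does not have 'text' in top_title to the end of the list. Perhaps you want some "weighted" version: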
ORDER BY
9 * toptitle_score +
3 * title_score +
1 * sub_text_score DESC
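(9, 3, 1 are rather arbitrary. They say that if 'text' shows up more than 3 times in title, that matters more than showing up once in top_title, or something like that.)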
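Putting the pieces together, one possible shape for the complete query is sketched below. It assumes the four FULLTEXT indexes described for Option 1 exist, keeps the arbitrary 9/3/1 weights, and introduces a hypothetical alias, relevance, for the weighted sum. The sum is computed in the SELECT list so that the ORDER BY only has to refer to a single alias.

```sql
SELECT
    title,
    -- Weighted relevance: a hit in top_title counts most, sub_text least.
    9 * MATCH (top_title) AGAINST ('text' IN BOOLEAN MODE)
  + 3 * MATCH (title)     AGAINST ('text' IN BOOLEAN MODE)
  + 1 * MATCH (sub_text)  AGAINST ('text' IN BOOLEAN MODE) AS relevance
FROM `posts`
-- The combined index filters the rows; the per-column indexes feed the scores.
WHERE MATCH (top_title, title, sub_text) AGAINST ('text' IN BOOLEAN MODE)
  AND `posts`.`deleted_at` IS NULL
  AND `published_at` IS NOT NULL
ORDER BY relevance DESC;
```

As a purely illustrative reading of the weights, using made-up scores: a row that scores 4 on title alone gets 3 × 4 = 12 and outranks a row that scores 1 on top_title alone (9 × 1 = 9). The scores MySQL actually returns are not simple occurrence counts, so the weights will likely need tuning against real data.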