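MySQL full-text: weighting a result more heavily

Time: 2013-03-21 00:19:44

Tags: mysql full-text-search

I have a database of cities and states (about 43,000 rows). I run a full-text search on it like this: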

select city, state, match(city, state_short, state) against (:q in boolean mode) as score
from zipcodes where
match(city, state_short, state) against (:q in boolean mode)
group by city, state order by score desc limit 6

When I replace :q with a meaningful string, it works. But say I search for houston texas: I want that exact result to come first, yet it comes third:

  • North Houston, Texas
  • South Houston, Texas
  • Houston, Texas

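How can I make Houston, Texas weigh more than the other two? Obviously this should work the same way for other cities like this.

Edit

Would this work? Any thoughts?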

SELECT * FROM (
    SELECT city, state, MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE) as score
    FROM zipcodes
    WHERE MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE)
    GROUP BY city, state
    ORDER BY score DESC LIMIT 6
) AS tbl
ORDER BY score DESC, LENGTH(city)

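1 answer:

Answer 0: (score: 1)

Your new query may work, but it is rather indirect. Instead of ORDER BY LENGTH(city), something like ORDER BY ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state))) would be better. It is not perfect, but it should work better, because any row with a high score whose length is close to that of the input is probably the one you are looking for. The final query would look like this: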

SELECT city, state, MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE) AS score
FROM zipcodes
WHERE MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE)
GROUP BY city, state
ORDER BY score DESC, ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state))) ASC
LIMIT 6

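I moved the new ORDER BY clause into the main query to eliminate the subquery. This should produce the same (or possibly more accurate) results, since the tie-breaker is now applied before LIMIT 6 rather than only to the six rows already selected.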
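To see what the length-difference tie-breaker does to the three rows from the question, here is a small Python sketch that runs outside MySQL. It assumes the three rows tie on full-text score (so only the tie-breaker decides the order) and that a smaller difference should sort first. The `rows` list and the `length_gap` helper are hypothetical names used only for illustration.

```python
# Hypothetical stand-in for the rows MySQL would return.
# Assumption: all three rows have the same full-text score.
q = "houston texas"
rows = [
    ("North Houston", "Texas"),
    ("South Houston", "Texas"),
    ("Houston", "Texas"),
]


def length_gap(query, city, state):
    """Mirror ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state))).

    MySQL's LENGTH() counts bytes and Python's len() counts characters;
    the two agree for the plain ASCII names used here.
    """
    return abs(len(query) - (len(city) + len(state)))


# Ascending sort: the row whose combined length is closest to the input wins.
ranked = sorted(rows, key=lambda row: length_gap(q, *row))

for city, state in ranked:
    print(city, state, length_gap(q, city, state))
```

With these inputs the gap is 1 for Houston and 5 for both North Houston and South Houston, so Houston sorts first. If the scores do not actually tie, the tie-breaker never gets a say, because score is the primary sort key.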

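Levenshtein distance would probably be a more accurate measure, but MySQL has no native implementation of it. This post provides more information on implementing a Levenshtein distance function in MySQL.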
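For a feel of how Levenshtein distance would rank the same three candidates, here is a minimal Python sketch of the standard dynamic-programming algorithm. It is not the MySQL implementation from the linked post; the `levenshtein` function and the `candidates` list are illustrative, and the comparison lowercases each candidate on the assumption that the search should be case-insensitive.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions, substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


q = "houston texas"
candidates = ["North Houston, Texas", "South Houston, Texas", "Houston, Texas"]

# Smallest distance first; lowercasing is an assumption, not part of the algorithm.
ranked = sorted(candidates, key=lambda c: levenshtein(q, c.lower()))

for name in ranked:
    print(name, levenshtein(q, name.lower()))
```

Here Houston, Texas is a single edit away from the input (the comma), while the other two need seven, so it ranks first. Computing this per row in application code only makes sense on a small candidate set, such as the rows the full-text search has already returned.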