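I have a database with over 1.3 million records. Each record has one or more categories (or tags) associated with it through a category matching table, which holds one or more rows pairing a placeid with a catid. I'm using full-text search to find matches in the placename and/or fullloc (full location) fields.
As you will see in the query below, I use an INNER JOIN and GROUP BY to handle the category matching. A place may have nearly 40 categories. When searching 1.3 million records, this is very slow.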
SELECT
MATCH(placename,fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE)
as score,
places.PlaceName, places.fullloc, places.PlaceID, places.ImageThumb,
places.contributorid, places.verified, places.verified_by,
places.verified_date, places.visibility
FROM places
INNER JOIN places_cats ON places.PlaceID = places_cats.PlaceID
WHERE STATUS IN (1,0)
AND MATCH(placename,fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE)
AND places_cats.CATID IN (129,75,104,126,115,140,128,137,136,114,135,
105,142,141,90,121,122,117,98,1,130,127,25,116,102,5,
88,87,31,24,37,134,39,40,112,34,30,133,9,8,7,11,20,19,
3,2,4,131,132,89,38,125,139,36,124,138,35,119,118,120,
96,97,95,13,17,14,12,21,15,32,16,26,29,28,27,85,86,84)
AND (visibility = 1 )
GROUP BY PlaceID
ORDER BY score DESC LIMIT 500
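I'm looking for help finding a new way to handle the categories. Is there a way to do this by storing the CatIDs in the Places table?
Here is the EXPLAIN output for the query above: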
1,SIMPLE,places,fulltext,"PRIMARY,by_plaeid,by_full_text",by_full_text,0,,1,Using where; Using temporary; Using filesort
1,SIMPLE,places_cats,ref,"by_PlaceIDandCatID,by_catid,by_catID_and_placeID,by_placeid",by_placeid,4,mapthepast.places.PlaceID,1,Using where
The query above currently takes 10.8 seconds. Removing the category JOIN (below) brings the execution time down to 2.08 seconds.
SELECT MATCH(placename,fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE) as score,
places.PlaceName, places.fullloc, places.PlaceID, places.ImageThumb, places.contributorid, places.verified, places.verified_by, places.verified_date, places.visibility
FROM places
WHERE STATUS IN (1,0) AND MATCH(placename,fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE)
GROUP BY PlaceID
ORDER BY score DESC LIMIT 500
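For comparison, the category filter in the first query could also be written as a semi-join with EXISTS, so that each place is returned at most once and the GROUP BY is no longer needed. This is only a sketch: it assumes places_cats has PlaceID and CATID columns (as the index names in the EXPLAIN output suggest), the category list is shortened for illustration, and whether it actually runs faster on this data has not been measured.

```sql
-- Sketch: same filters as the first query, but the category match is an
-- EXISTS subquery instead of INNER JOIN + GROUP BY.
-- Assumption: places_cats(PlaceID, CATID) is the category matching table.
SELECT
  MATCH(placename, fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE) AS score,
  places.PlaceName, places.fullloc, places.PlaceID, places.ImageThumb,
  places.contributorid, places.verified, places.verified_by,
  places.verified_date, places.visibility
FROM places
WHERE STATUS IN (1,0)
  AND visibility = 1
  AND MATCH(placename, fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE)
  AND EXISTS (
    SELECT 1
    FROM places_cats
    WHERE places_cats.PlaceID = places.PlaceID
      AND places_cats.CATID IN (129, 75, 104)  -- shortened list, illustration only
  )
ORDER BY score DESC
LIMIT 500;
```

Running EXPLAIN on this variant would show whether the optimizer still needs a temporary table and filesort for the category step.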