SQL strategy for millions of records with multiple categories each

Time: 2013-11-10 01:56:10

Tags: mysql sql

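I have a database with over 1.3 million records. Each record has one or more categories (or tags) associated with it through a category-matching table, places_cats, which holds one or more rows pairing a PlaceID with a CatID. I am using full-text search to find matches in the placename and/or fullloc (full location) fields.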

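As you will see in the query below, I use an INNER JOIN and a GROUP BY to handle the category matching. A place may have nearly 40 categories. When searching 1.3 million records, this is very slow.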

SELECT 
  MATCH(placename,fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE) 
    AS score, 
  places.PlaceName, places.fullloc, places.PlaceID, places.ImageThumb, 
  places.contributorid, places.verified, places.verified_by, 
  places.verified_date, places.visibility 
FROM places 
INNER JOIN places_cats ON places_cats.PlaceID = places.PlaceID 
WHERE STATUS IN (1,0) 
  AND MATCH(placename,fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE) 
  AND places_cats.CATID IN (129,75,104,126,115,140,128,137,136,114,135,
    105,142,141,90,121,122,117,98,1,130,127,25,116,102,5,
    88,87,31,24,37,134,39,40,112,34,30,133,9,8,7,11,20,19,
    3,2,4,131,132,89,38,125,139,36,124,138,35,119,118,120,
    96,97,95,13,17,14,12,21,15,32,16,26,29,28,27,85,86,84) 
  AND (visibility = 1) 
GROUP BY places.PlaceID 
ORDER BY score DESC LIMIT 500

I am looking for help finding a new way to handle the categories. Is there a way to do this by storing the CatIDs in the places table?

Here is the EXPLAIN for the query above:

id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,places,fulltext,"PRIMARY,by_plaeid,by_full_text",by_full_text,0,,1,Using where; Using temporary; Using filesort
1,SIMPLE,places_cats,ref,"by_PlaceIDandCatID,by_catid,by_catID_and_placeID,by_placeid",by_placeid,4,mapthepast.places.PlaceID,1,Using where

The query above currently takes 10.8 seconds. With the category JOIN removed (below), execution time drops to 2.08 seconds.

SELECT MATCH(placename,fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE) as score, 
places.PlaceName, places.fullloc, places.PlaceID, places.ImageThumb, places.contributorid, places.verified, places.verified_by, places.verified_date, places.visibility 
FROM places 
WHERE STATUS IN (1,0) AND MATCH(placename,fullloc) AGAINST ('chappaqua school' IN BOOLEAN MODE) 
GROUP BY PlaceID 
ORDER BY score DESC LIMIT 500
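
One variation that may be worth timing is to move the category test into an EXISTS subquery, so that each place produces at most one row and the GROUP BY is no longer needed to collapse duplicates. The following is only a sketch under assumptions: it assumes STATUS and visibility are columns of places, that places_cats has PlaceID and CATID columns (as the EXPLAIN output suggests), and the aliases p and pc are introduced here for illustration. Whether it is actually faster depends on the MySQL version and the optimizer's plan, so it should be checked with EXPLAIN.

```sql
-- Sketch only: same filters as the joined query, but the category match
-- is a semi-join (EXISTS), so no row multiplication and no GROUP BY.
SELECT
  MATCH(p.placename, p.fullloc)
    AGAINST ('chappaqua school' IN BOOLEAN MODE) AS score,
  p.PlaceName, p.fullloc, p.PlaceID, p.ImageThumb,
  p.contributorid, p.verified, p.verified_by,
  p.verified_date, p.visibility
FROM places AS p
WHERE p.STATUS IN (1,0)
  AND p.visibility = 1
  AND MATCH(p.placename, p.fullloc)
    AGAINST ('chappaqua school' IN BOOLEAN MODE)
  AND EXISTS (
    -- True as soon as one matching category row is found for this place.
    SELECT 1
    FROM places_cats AS pc
    WHERE pc.PlaceID = p.PlaceID
      AND pc.CATID IN (129,75,104,126,115,140,128,137,136,114,135,
        105,142,141,90,121,122,117,98,1,130,127,25,116,102,5,
        88,87,31,24,37,134,39,40,112,34,30,133,9,8,7,11,20,19,
        3,2,4,131,132,89,38,125,139,36,124,138,35,119,118,120,
        96,97,95,13,17,14,12,21,15,32,16,26,29,28,27,85,86,84)
  )
ORDER BY score DESC
LIMIT 500;
```

The subquery probes places_cats by PlaceID and CATID, which is the kind of lookup a composite index such as the existing by_PlaceIDandCatID could serve, if the optimizer chooses it.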

0 answers:

No answers yet.