Small SELECT subquery slows down the whole query by about 10 times

Date: 2012-11-30 11:04:41

Tags: mysql database-performance

I have the following MySQL table:

CREATE TABLE IF NOT EXISTS `pics` (
  `id` mediumint(8) unsigned NOT NULL auto_increment,
  `bnb_id` mediumint(7) unsigned NOT NULL,
  `img_path` varchar(128) NOT NULL,
  `img_path_gallery` varchar(128) NOT NULL,
  `img_path_thumb_small` varchar(128) NOT NULL,
  `img_path_thumb_large` varchar(128) NOT NULL,
  `img_path_thumb_grid` varchar(128) NOT NULL,
  `title` varchar(80) NOT NULL,
  `order` tinyint(2) NOT NULL,
  `upload_date` datetime NOT NULL,
  `state` enum('LOCAL','S3') NOT NULL default 'LOCAL',
  `is_cover` tinyint(1) unsigned default NULL,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `bnb_id_2` (`bnb_id`,`is_cover`),
  KEY `bnb_id` (`bnb_id`),
  KEY `is_cover` (`is_cover`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=30371 ;

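is_cover is a field I created to select only one picture per bnb_id: it is set to 1 when the picture is chosen as the cover and to NULL otherwise. I need to LEFT JOIN this table to another one, let's call it bnb. Each bnb entry can have multiple rows in the pics table (there is a referential integrity constraint on bnb_id), but in this case I must extract only one row from the pics table, hence the is_cover column and all the indexes (every other solution I tried produced queries lasting 10 to 50 seconds).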

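Even so, the query is very slow: with about 10,000 rows in the bnb table and 30,000 in the pics table, it takes 5 to 8 seconds. Selecting from the pics table where is_cover = 1 is fast and straightforward, but everything falls apart once that select is put into the bigger query.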

SELECT subbnb.*,
  3956 * 2 * ASIN( SQRT(
    POWER( SIN((_LAT - abs(lat)) * pi()/180 / 2), 2) +
    COS(_LAT * pi()/180 ) * COS(abs(lat) * pi()/180) *
    POWER( SIN((_LNG - abs(lng)) * pi()/180 / 2), 2)
  ) ) AS distance,
  prices.price,
  pics.img_path_thumb_grid,
  reviews.count reviewsCount,
  likes.count likesCount
FROM (
  SELECT bnb.*, bnbdata_a.*, pos.lat, pos.lng
  FROM bnb
  JOIN bnbdata ON (bnb.id = bnbdata.bnb_id)
  JOIN positions pos ON (bnb.id = pos.bnb_id)
) subbnb
LEFT JOIN ( SELECT * FROM pics WHERE is_cover = 1 ) pics
  ON (subbnb.id = pics.bnb_id)
LEFT JOIN (SELECT price, bnb_id FROM prices WHERE category = "DAILY") prices
  ON (subbnb.id = prices.bnb_id)
LEFT JOIN (SELECT COUNT(*) AS count, bnb_id FROM reviews GROUP BY bnb_id) reviews
  ON (subbnb.id = reviews.bnb_id)
LEFT JOIN (SELECT COUNT(*) AS count, bnb_id FROM likes GROUP BY bnb_id) likes
  ON (subbnb.id = likes.bnb_id)
WHERE lng BETWEEN _LNGA AND _LNGB
  AND lat BETWEEN _LATA AND _LATB
HAVING distance < 10
ORDER BY distance
LIMIT 0, 25

(strings preceded by _ are actual numeric values)

The EXPLAIN query yields the following results:

id  select_type  table       type    possible_keys  key       key_len  ref            rows   Extra
1   PRIMARY      <derived5>  system  NULL           NULL      NULL     NULL           0      const row not found
1   PRIMARY      <derived6>  system  NULL           NULL      NULL     NULL           0      const row not found
1   PRIMARY      <derived2>  ALL     NULL           NULL      NULL     NULL           10522  Using where; Using temporary; Using filesort
1   PRIMARY      <derived3>  ALL     NULL           NULL      NULL     NULL           7040
1   PRIMARY      <derived4>  ALL     NULL           NULL      NULL     NULL           1
6   DERIVED      likes       index   NULL           PRIMARY   6        NULL           1      Using index
5   DERIVED      reviews     index   NULL           bnb_id    5        NULL           1      Using index
4   DERIVED      prices      ALL     NULL           NULL      NULL     NULL           1      Using where
3   DERIVED      pics        ref     is_cover       is_cover  2                       11760  Using where
2   DERIVED      pos         ALL     PRIMARY        NULL      NULL     NULL           10543
2   DERIVED      bnbdata     eq_ref  PRIMARY        PRIMARY   3        db.pos.bnb_id  1
2   DERIVED      bnb         eq_ref  PRIMARY        PRIMARY   3        db.pos.bnb_id  1

It looks like MySQL (Using where, id 4) ignores the is_cover index, but when I run the small select against the pics table on its own, the same thing happens and it is fast. I cannot find the bottleneck in this query: removing the JOIN to pics makes it much faster, yet the JOINed subquery is very fast on its own, and so is the rest of the big query. Even with the math at the beginning, it never exceeds 2 seconds of execution time.

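Does anyone know where the bottleneck is and how to fix it?

1 Answer:

Answer 0: (score: 1)

You could try rebuilding your query with plain joins like this (sorry if it is not correct, but you only described one table):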

SELECT
  bnb.*, bnbdata.*, 
  pos.lat, pos.lng,
  3956 * 2 * ASIN(
    SQRT(
      POWER(
        SIN((_LAT - abs(lat)) * pi()/180 / 2), 
      2) +
      COS(_LAT * pi()/180 ) * 
      COS(abs(lat) * pi()/180) * 
      POWER(
        SIN((_LNG - abs(lng)) * pi()/180 / 2), 
      2) 
    )
  ) AS distance,
  prices.price,
  pics.img_path_thumb_grid,
  reviews.count reviewsCount,
  likes.count likesCount
FROM bnb
JOIN bnbdata 
  ON bnb.id = bnbdata.bnb_id
JOIN positions pos 
  ON bnb.id = pos.bnb_id
LEFT JOIN pics 
  ON bnb.id = pics.bnb_id AND pics.is_cover = 1
LEFT JOIN prices 
  ON bnb.id = prices.bnb_id AND prices.category = "DAILY"
LEFT JOIN (SELECT COUNT(*) AS count, bnb_id FROM reviews GROUP BY bnb_id) reviews
  ON bnb.id = reviews.bnb_id
LEFT JOIN (SELECT COUNT(*) AS count, bnb_id FROM likes GROUP BY bnb_id) likes
  ON bnb.id = likes.bnb_id
WHERE
  lng BETWEEN _LNGA AND _LNGB AND lat BETWEEN _LATA AND _LATB
HAVING distance < 10
ORDER BY distance
LIMIT 0, 25

Or rebuild it like this, picking the 25 nearest rows first and joining the other tables afterwards:

SELECT tmp_bnb.*,
  prices.price,
  pics.img_path_thumb_grid,
  reviews.count reviewsCount,
  likes.count likesCount 
FROM     
  (
    -- the question writes bnbdata_a.*; bnbdata is the table actually joined
    SELECT
      bnb.*, bnbdata.*, 
      pos.lat, pos.lng,
      3956 * 2 * ASIN(
      SQRT(
        POWER(
          SIN((_LAT - abs(lat)) * pi()/180 / 2), 
        2) +
        COS(_LAT * pi()/180 ) * 
        COS(abs(lat) * pi()/180) * 
        POWER(
          SIN((_LNG - abs(lng)) * pi()/180 / 2), 
        2) 
      )
      ) AS distance
    FROM bnb
    JOIN bnbdata 
      ON bnb.id = bnbdata.bnb_id
    JOIN positions pos 
      ON bnb.id = pos.bnb_id
    WHERE
      lng BETWEEN _LNGA AND _LNGB AND lat BETWEEN _LATA AND _LATB
    -- a column alias cannot be used in WHERE, so filter on distance in HAVING
    HAVING distance < 10
    ORDER BY distance
    LIMIT 0, 25
  ) AS tmp_bnb
LEFT JOIN pics 
  ON tmp_bnb.id = pics.bnb_id AND pics.is_cover = 1
LEFT JOIN prices 
  ON tmp_bnb.id = prices.bnb_id AND prices.category = "DAILY"
LEFT JOIN (SELECT COUNT(*) AS count, bnb_id FROM reviews GROUP BY bnb_id) reviews
  ON tmp_bnb.id = reviews.bnb_id
LEFT JOIN (SELECT COUNT(*) AS count, bnb_id FROM likes GROUP BY bnb_id) likes
  ON tmp_bnb.id = likes.bnb_id
ORDER BY tmp_bnb.distance

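Or you can split the query in two: the first query fetches the basic information, and a second one fetches the extras, such as the reviews count and the likes count.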
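
A rough sketch of that two-step approach, in case it helps. It uses Python's built-in sqlite3 module only so that it runs anywhere; the helper names (haversine_miles, nearest_bnbs, counts_for), the sample rows and the minimal table layouts are made up for illustration, and against MySQL the distance could just as well stay in SQL as in the queries above.

```python
import math
import sqlite3

EARTH_RADIUS_MILES = 3956  # same constant as in the SQL distance expression


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles (haversine formula)."""
    d_lat = math.radians(lat1 - lat2)
    d_lng = math.radians(lng1 - lng2)
    a = (math.sin(d_lat / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(d_lng / 2) ** 2)
    return EARTH_RADIUS_MILES * 2 * math.asin(math.sqrt(a))


def nearest_bnbs(conn, lat, lng, box, max_miles=10, limit=25):
    """Step 1: bounding-box filter in SQL, distance filter and sort in Python."""
    lat_a, lat_b, lng_a, lng_b = box
    rows = conn.execute(
        "SELECT bnb.id, pos.lat, pos.lng FROM bnb "
        "JOIN positions pos ON bnb.id = pos.bnb_id "
        "WHERE pos.lat BETWEEN ? AND ? AND pos.lng BETWEEN ? AND ?",
        (lat_a, lat_b, lng_a, lng_b),
    ).fetchall()
    scored = [(bnb_id, haversine_miles(lat, lng, r_lat, r_lng))
              for bnb_id, r_lat, r_lng in rows]
    scored = [s for s in scored if s[1] < max_miles]
    scored.sort(key=lambda s: s[1])
    return scored[:limit]


def counts_for(conn, table, bnb_ids):
    """Step 2: count rows per bnb_id, only for the ids found in step 1.

    `table` is interpolated into the SQL, so it must be a trusted name.
    """
    if not bnb_ids:
        return {}
    placeholders = ",".join("?" * len(bnb_ids))
    sql = (f"SELECT bnb_id, COUNT(*) FROM {table} "
           f"WHERE bnb_id IN ({placeholders}) GROUP BY bnb_id")
    return dict(conn.execute(sql, list(bnb_ids)).fetchall())


# Tiny made-up data set: bnb 3 lies outside the bounding box.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bnb (id INTEGER PRIMARY KEY);
    CREATE TABLE positions (bnb_id INTEGER PRIMARY KEY, lat REAL, lng REAL);
    CREATE TABLE reviews (id INTEGER PRIMARY KEY, bnb_id INTEGER);
    CREATE TABLE likes (id INTEGER PRIMARY KEY, bnb_id INTEGER);
    INSERT INTO bnb VALUES (1), (2), (3);
    INSERT INTO positions VALUES (1, 45.00, 9.00), (2, 45.05, 9.00), (3, 46.00, 9.00);
    INSERT INTO reviews (bnb_id) VALUES (1), (1), (2);
    INSERT INTO likes (bnb_id) VALUES (2);
""")

nearest = nearest_bnbs(conn, 45.0, 9.0, box=(44.5, 45.5, 8.5, 9.5))
ids = [bnb_id for bnb_id, _ in nearest]
reviews_count = counts_for(conn, "reviews", ids)
likes_count = counts_for(conn, "likes", ids)
```

The point is that the counting queries only ever touch the (at most 25) ids returned by the first step, instead of grouping the whole reviews and likes tables on every request.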

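I also think it would be a good idea to add reviews_counter and likes_counter columns to the bnb table, so the counts are not computed on every request but refreshed periodically (hourly, maybe) or kept up to date by an insert trigger. You could also add a new column cover_pic_id to the bnb table to hold the ID of the cover picture.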
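
Here is a minimal sketch of the counter idea for reviews. Again it uses SQLite through Python's sqlite3 module purely so it can be run as is; the trigger names and the recompute_counters helper are hypothetical. MySQL trigger syntax differs slightly (it needs FOR EACH ROW, and a single-statement trigger body does not need BEGIN ... END), so treat this as an outline rather than something to paste in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bnb (
        id INTEGER PRIMARY KEY,
        reviews_counter INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE reviews (id INTEGER PRIMARY KEY, bnb_id INTEGER NOT NULL);

    -- Keep the cached counter in step with inserts and deletes.
    CREATE TRIGGER reviews_after_insert AFTER INSERT ON reviews
    BEGIN
        UPDATE bnb SET reviews_counter = reviews_counter + 1 WHERE id = NEW.bnb_id;
    END;
    CREATE TRIGGER reviews_after_delete AFTER DELETE ON reviews
    BEGIN
        UPDATE bnb SET reviews_counter = reviews_counter - 1 WHERE id = OLD.bnb_id;
    END;

    INSERT INTO bnb (id) VALUES (1), (2);
    INSERT INTO reviews (bnb_id) VALUES (1), (1), (2);
    DELETE FROM reviews WHERE id = 3;
""")


def recompute_counters(conn):
    """Periodic alternative to triggers: rebuild every counter from scratch."""
    conn.execute(
        "UPDATE bnb SET reviews_counter = "
        "(SELECT COUNT(*) FROM reviews WHERE reviews.bnb_id = bnb.id)"
    )


def read_counters(conn):
    return dict(conn.execute("SELECT id, reviews_counter FROM bnb").fetchall())


trigger_counters = read_counters(conn)
recompute_counters(conn)
recomputed_counters = read_counters(conn)
```

With the counter stored on bnb, the listing query can read bnb.reviews_counter directly and drop the grouped reviews subquery; the same pattern would apply to likes_counter.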

Let me know how it performs.