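MySQL subquery performance and index issue: 1 row vs. a range

Time: 2013-11-14 21:09:01

Tags: mysql sql

I created the query below, which shows a price together with a kind of delta index computed against past prices from the same table (the real query uses several subqueries over different date intervals, so I would prefer to avoid multiple JOINs):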

SELECT
H1.`item_id`,
H1.`date`,
H1.`price`,
(SELECT AVG(H2.price)/H1.`price`
    FROM hive_item_price H2 FORCE INDEX (date_id)
    WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
    AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +12 hour) AND H1.`date`) AS fDelta12hrs,
(SELECT AVG(H2.price)/H1.`price`
    FROM hive_item_price H2 FORCE INDEX (date_id)
    WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
    AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +48 hour) AND H1.`date`) AS fDelta48hrs
FROM hive_item_price H1
WHERE H1.id = 3915328

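It works well, but I have to force the index: MySQL does not use it otherwise, and the query is then very slow. The trouble starts as soon as I specify more than one row in the WHERE clause (i.e. "WHERE H1.id IN (3915328,3915044)" vs. "WHERE H1.id = 3915328").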

...
WHERE H1.id IN (3915328,3915044)

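This changes the query plan and the query becomes very slow (a ratio of something like 1 to 10000!). The index seems to be used incorrectly. My goal is to run this over millions of prices :). I used EXPLAIN, but I could not figure out how to get a similar query plan or similar performance.

Here is the query plan that runs fast (with "WHERE H1.id = 3915328", only 1 row):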

| id | select_type        | table | type  | possible_keys | key     | key_len | ref    | rows | Extra       |
| 1  | PRIMARY            | H1    | const | PRIMARY       | PRIMARY | 8       | const  | 1    |             |
| 2  | DEPENDENT SUBQUERY | H2    | range | date_id       | date_id | 16      | {null} | 61   | Using where |

And here is the new plan after changing "WHERE H1.id = 3915328" to "WHERE H1.id IN (3915328,3915044)":

| id | select_type        | table | type  | possible_keys | key     | key_len | ref                 | rows  | Extra       |
| 1  | PRIMARY            | H1    | range | PRIMARY       | PRIMARY | 8       | {null}              | 2     | Using where |
| 2  | DEPENDENT SUBQUERY | H2    | ref   | date_id       | date_id | 8       | tvlr_old.H1.item_id | 19578 | Using where |
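
One possible reading of the two plans, offered as an interpretation rather than a confirmed diagnosis: in the first plan H1 has access type const, which means MySQL reads that single row up front and treats its columns as constants, so both bounds of the BETWEEN are known and the subquery can do a range scan using 16 bytes of date_id. In the second plan H1 is read as a range, H1.date differs per outer row, and the subquery falls back to a ref lookup on only 8 bytes of the key (ref = H1.item_id), with an estimate of 19578 rows that are then filtered by date. The question does not show which columns date_id covers, so the statements below are only a sketch for checking this on the real table; they use nothing beyond the table, column, and index names from the question.

```sql
-- Show the columns of every index on the table; Seq_in_index gives the
-- order of the key parts inside date_id (not shown in the question).
SHOW INDEX FROM hive_item_price;

-- Plan for the single-row form (reported as fast).
EXPLAIN
SELECT H1.`item_id`, H1.`date`, H1.`price`,
       (SELECT AVG(H2.price) / H1.`price`
          FROM hive_item_price H2 FORCE INDEX (date_id)
         WHERE H2.item_id = H1.item_id
           AND H2.bee_hive_id = H1.bee_hive_id
           AND H2.`date` BETWEEN DATE_SUB(H1.`date`, INTERVAL 12 HOUR) AND H1.`date`
       ) AS fDelta12hrs
FROM hive_item_price H1
WHERE H1.id = 3915328;

-- Plan for the multi-row form (reported as slow); compare the type,
-- key_len and rows columns of the H2 line with the plan above.
EXPLAIN
SELECT H1.`item_id`, H1.`date`, H1.`price`,
       (SELECT AVG(H2.price) / H1.`price`
          FROM hive_item_price H2 FORCE INDEX (date_id)
         WHERE H2.item_id = H1.item_id
           AND H2.bee_hive_id = H1.bee_hive_id
           AND H2.`date` BETWEEN DATE_SUB(H1.`date`, INTERVAL 12 HOUR) AND H1.`date`
       ) AS fDelta12hrs
FROM hive_item_price H1
WHERE H1.id IN (3915328, 3915044);
```

If the key_len of the H2 line drops between the two plans, as it does in the output quoted here, fewer key parts of date_id are being used for the lookup in the multi-row form.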

The data looks like this:

id      item_id price date
3915328 4       94,00 21/06/2013 10:24:03
3915044 4       93,00 21/06/2013 10:12:03
3914761 4       92,00 21/06/2013 10:00:03
3914475 4       92,00 21/06/2013 09:48:03
3914189 4       91,00 21/06/2013 09:36:03
3913905 4       91,00 21/06/2013 09:24:03
3913620 4       91,00 21/06/2013 09:12:03
3913335 4       90,00 21/06/2013 09:00:03
3913050 4       90,00 21/06/2013 08:48:03
3912764 4       90,00 21/06/2013 08:36:03

Thanks for your help.

2 answers:

Answer 0 (score: 0)

Could you try this query? Just out of curiosity:

SELECT 
    H1.id,
    AVG(H2.price)/H1.`price` AS fDelta48,
    AVG(H3.price)/H1.`price` AS fDelta24
FROM 
    hive_item_price H1
        JOIN hive_item_price H2 ON 
                H2.item_id = H1.item_id 
            AND H2.bee_hive_id = H1.bee_hive_id
            AND H2.date BETWEEN DATE_SUB(H1.`date`, INTERVAL +48 HOUR) AND H1.`date`
        JOIN hive_item_price H3 ON 
                H3.item_id = H1.item_id 
            AND H3.bee_hive_id = H1.bee_hive_id
            AND H3.date BETWEEN DATE_SUB(H1.`date`, INTERVAL +24 HOUR) AND H1.`date`
WHERE 
    H1.id IN (3915328, 3915044)
GROUP BY
    H1.id;

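Answer 1 (score: 0)

Given that the version of the query for a single row/id is more than 1000 times faster than the version for 2+ rows/ids, and that I could not steer MySQL away from the bad query plan in that case, the fastest solution I have found so far for multiple ids/rows is a cursor that runs the 1-row query once per id.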

DROP TABLE IF EXISTS tempPrices;
CREATE TEMPORARY TABLE tempPrices
(
  iId INT unsigned NOT NULL,
  dDateCollected datetime,
  fPrice FLOAT,
  fDelta12hrs FLOAT,
  fDelta48hrs FLOAT
)ENGINE=MEMORY;

DROP PROCEDURE IF EXISTS pricefcloop;

CREATE PROCEDURE pricefcloop()
BEGIN
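  DECLARE done INT DEFAULT FALSE;
  DECLARE curr_id INT;
  DECLARE cur1 CURSOR FOR SELECT id FROM hive_item_price WHERE id IN (3915328, 3915044, ....);
  -- Set the flag when FETCH runs past the last row, so the loop can exit
  -- instead of failing with a "no data" error.
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

  OPEN cur1;

  read_loop: LOOP
    FETCH cur1 INTO curr_id;
    IF done THEN
      LEAVE read_loop;
    END IF;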
    INSERT INTO tempPrices (iId, dDateCollected, fPrice, fDelta12hrs, fDelta48hrs)
      SELECT
      H1.`item_id`,
      H1.`date`,
      H1.`price`,
      (SELECT AVG(H2.price)/H1.`price`
          FROM hive_item_price H2 FORCE INDEX (date_id)
          WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
          AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +12 hour) AND H1.`date`) AS fDelta12hrs,
      (SELECT AVG(H2.price)/H1.`price`
          FROM hive_item_price H2 FORCE INDEX (date_id)
          WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
          AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +48 hour) AND H1.`date`) AS fDelta48hrs
      FROM hive_item_price H1
      WHERE H1.id = curr_id;
  END LOOP;

  CLOSE cur1;
END;

CALL pricefcloop();

SELECT * FROM tempPrices;
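
When a procedure like the one above is created from the mysql command-line client, the semicolons inside the body end the statement early unless the client delimiter is changed first. The sketch below shows that wrapping on a smaller, hypothetical procedure (price_delta_for_id and its parameter p_id are names made up for illustration, not part of the original answer) that runs the same single-row query shape for one id. It assumes the table and index names from the question and has not been timed against the cursor version.

```sql
-- DELIMITER is a command of the mysql client, not server-side SQL.
DELIMITER $$

-- Hypothetical procedure name, chosen for this example only.
DROP PROCEDURE IF EXISTS price_delta_for_id $$

CREATE PROCEDURE price_delta_for_id(IN p_id BIGINT)
BEGIN
  -- Same single-row query shape as in the question: H1 is looked up
  -- by primary key, so only one outer row drives the subquery.
  SELECT
    H1.`item_id`,
    H1.`date`,
    H1.`price`,
    (SELECT AVG(H2.price) / H1.`price`
       FROM hive_item_price H2 FORCE INDEX (date_id)
      WHERE H2.item_id = H1.item_id
        AND H2.bee_hive_id = H1.bee_hive_id
        AND H2.`date` BETWEEN DATE_SUB(H1.`date`, INTERVAL 12 HOUR) AND H1.`date`
    ) AS fDelta12hrs
  FROM hive_item_price H1
  WHERE H1.id = p_id;
END $$

-- Restore the default delimiter before running ordinary statements.
DELIMITER ;

CALL price_delta_for_id(3915328);
```

Graphical clients and most connectors send the whole CREATE PROCEDURE text as one statement, so the DELIMITER lines are typically only needed in the command-line client.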