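I created the query below, which shows a price together with a kind of delta index computed from past prices in the same table (the real query uses several subqueries over different date intervals, so I would prefer to avoid multiple JOINs):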
SELECT
H1.`item_id`,
H1.`date`,
H1.`price`,
(SELECT AVG(H2.price)/H1.`price`
FROM hive_item_price H2 FORCE INDEX (date_id)
WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +12 hour) AND H1.`date`) AS fDelta12hrs,
(SELECT AVG(H2.price)/H1.`price`
FROM hive_item_price H2 FORCE INDEX (date_id)
WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +48 hour) AND H1.`date`) AS fDelta48hrs
FROM hive_item_price H1
WHERE H1.id = 3915328
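It works well, but I have to force the index: MySQL does not use it otherwise, and the query becomes very slow. The problem starts as soon as the WHERE clause selects more than one row (i.e. "WHERE H1.id IN (3915328,3915044)" instead of "WHERE H1.id = 3915328").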
...
WHERE H1.id IN (3915328,3915044)
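That changes the query plan and makes the query extremely slow (a ratio of roughly 1 to 10000!). The index seems to be used incorrectly. My goal is to run this over millions of prices :). I used EXPLAIN, but could not figure out how to get a similar query plan or similar performance.
Here is the query plan for the fast version (with "WHERE H1.id = 3915328", only 1 row):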
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
|1 | PRIMARY | H1 | const | PRIMARY | PRIMARY | 8 | const | 1 |
|2 | DEPENDENT SUBQUERY | H2 | range | date_id | date_id | 16 | {null}| 61 | Using where
And here is the new plan after changing "WHERE H1.id = 3915328" to "WHERE H1.id IN (3915328,3915044)":
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
| 1 | PRIMARY | H1 | range | PRIMARY | PRIMARY | 8 | {null} | 2 | Using where
| 2 | DEPENDENT SUBQUERY | H2 | ref | date_id | date_id | 8 | tvlr_old.H1.item_id | 19578 | Using where
The data looks like this:
id item_id price date
3915328 4 94,00 21/06/2013 10:24:03
3915044 4 93,00 21/06/2013 10:12:03
3914761 4 92,00 21/06/2013 10:00:03
3914475 4 92,00 21/06/2013 09:48:03
3914189 4 91,00 21/06/2013 09:36:03
3913905 4 91,00 21/06/2013 09:24:03
3913620 4 91,00 21/06/2013 09:12:03
3913335 4 90,00 21/06/2013 09:00:03
3913050 4 90,00 21/06/2013 08:48:03
3912764 4 90,00 21/06/2013 08:36:03
Thanks for your help.
Answer 0 (score: 0)
Could you try this query? Just out of curiosity:
SELECT
H1.id,
AVG(H2.price)/H1.`price` AS fDelta48,
AVG(H3.price)/H1.`price` AS fDelta24
FROM
hive_item_price H1
JOIN hive_item_price H2 ON
H2.item_id = H1.item_id
AND H2.bee_hive_id = H1.bee_hive_id
AND H2.date BETWEEN DATE_SUB(H1.`date`, INTERVAL +48 HOUR) AND H1.`date`
JOIN hive_item_price H3 ON
H3.item_id = H1.item_id
AND H3.bee_hive_id = H1.bee_hive_id
AND H3.date BETWEEN DATE_SUB(H1.`date`, INTERVAL +24 HOUR) AND H1.`date`
WHERE
H1.id IN (3915328, 3915044)
GROUP BY
H1.id;
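Joining the table twice pairs every H2 row with every H3 row for each H1 row, so the intermediate result grows with the product of the two windows. A variant worth trying is a single join over the widest window, with the narrower window computed by conditional aggregation. This is only a sketch against the table and columns from the question, not tested on the real data, and whether the optimizer chooses range access for it is not guaranteed. It uses the 12-hour and 48-hour windows from the original query.

```sql
SELECT
  H1.id,
  H1.item_id,
  H1.`date`,
  H1.price,
  -- AVG ignores NULLs, so rows older than 12 hours drop out of this average
  AVG(CASE
        WHEN H2.`date` >= DATE_SUB(H1.`date`, INTERVAL 12 HOUR) THEN H2.price
      END) / H1.price AS fDelta12hrs,
  -- every joined row already lies inside the 48-hour window
  AVG(H2.price) / H1.price AS fDelta48hrs
FROM hive_item_price H1
JOIN hive_item_price H2
  ON H2.item_id = H1.item_id
 AND H2.bee_hive_id = H1.bee_hive_id
 AND H2.`date` BETWEEN DATE_SUB(H1.`date`, INTERVAL 48 HOUR) AND H1.`date`
WHERE H1.id IN (3915328, 3915044)
GROUP BY H1.id, H1.item_id, H1.`date`, H1.price;
```

Each H1 row falls inside its own window, so the inner join never drops a requested id. Further intervals can be added as extra CASE expressions as long as the join condition covers the longest one.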
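Answer 1 (score: 0)
Given that the single-row/id version of the query is more than 1000 times faster than the version with 2+ rows/ids, and that I could not steer MySQL away from the bad query plan in that case, the fastest solution I have found so far for multiple ids/rows is a cursor that runs the 1-row query once per id.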
DROP TABLE IF EXISTS tempPrices;
CREATE TEMPORARY TABLE tempPrices
(
iId INT unsigned NOT NULL,
dDateCollected datetime,
fPrice FLOAT,
fDelta12hrs FLOAT,
fDelta48hrs FLOAT
)ENGINE=MEMORY;
DROP PROCEDURE IF EXISTS pricefcloop;
-- DELIMITER is a mysql client command; it keeps the semicolons inside the body from ending the statement
DELIMITER //
CREATE PROCEDURE pricefcloop()
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE curr_id INT;
DECLARE cur1 CURSOR FOR SELECT id FROM hive_item_price WHERE id IN (3915328, 3915044, ....);
-- without this handler the FETCH past the last row aborts the procedure with a "no data" error
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
OPEN cur1;
read_loop: LOOP
FETCH cur1 INTO curr_id;
IF done THEN
LEAVE read_loop;
END IF;
-- run the fast single-row query for the current id
INSERT INTO tempPrices (iId, dDateCollected, fPrice, fDelta12hrs, fDelta48hrs)
SELECT
H1.`item_id`,
H1.`date`,
H1.`price`,
(SELECT AVG(H2.price)/H1.`price`
FROM hive_item_price H2 FORCE INDEX (date_id)
WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +12 hour) AND H1.`date`) AS fDelta12hrs,
(SELECT AVG(H2.price)/H1.`price`
FROM hive_item_price H2 FORCE INDEX (date_id)
WHERE H2.item_id = H1.item_id AND H2.bee_hive_id = H1.bee_hive_id
AND H2.date BETWEEN DATE_SUB(H1.`date`, interval +48 hour) AND H1.`date`) AS fDelta48hrs
FROM hive_item_price H1
WHERE H1.id = curr_id;
END LOOP;
CLOSE cur1;
END//
DELIMITER ;
CALL pricefcloop();
SELECT * FROM tempPrices;
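Before settling on the cursor, it may be worth testing whether a wider index lets the optimizer keep a narrow index scan in the multi-row case. In the slow plan the subquery uses `ref` access on H1.item_id with key_len 8, which suggests that only the item_id part of date_id is used for the lookup and the date condition is then checked row by row. The sketch below assumes date_id does not already cover (item_id, bee_hive_id, date); the index name idx_item_hive_date is made up for illustration, and I have not verified that it changes the plan on this data.

```sql
-- Hypothetical index name. The two equality columns come first,
-- the range column (date) comes last.
ALTER TABLE hive_item_price
  ADD INDEX idx_item_hive_date (item_id, bee_hive_id, `date`);

-- Re-check the plan for the multi-row case with the new index hinted.
EXPLAIN
SELECT
  H1.id,
  (SELECT AVG(H2.price) / H1.price
   FROM hive_item_price H2 FORCE INDEX (idx_item_hive_date)
   WHERE H2.item_id = H1.item_id
     AND H2.bee_hive_id = H1.bee_hive_id
     AND H2.`date` BETWEEN DATE_SUB(H1.`date`, INTERVAL 12 HOUR) AND H1.`date`) AS fDelta12hrs
FROM hive_item_price H1
WHERE H1.id IN (3915328, 3915044);
```

If the DEPENDENT SUBQUERY row for H2 still shows a row estimate in the thousands, the index did not help and the cursor remains the fallback.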