Select query optimization for huge data on MySQL

Time: 2013-08-03 12:13:56

Tags: mysql select query-optimization union

## MySQL Server 5.5, storage engine MyISAM. Table fact_transaction has a composite index on date_key, time_key, unit_cost_price, unit_retail_price; fact_stockout_sales has the same index, excluding time_key. ##
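
As a rough sketch, the composite indexes described above would correspond to DDL along these lines. The index names `idx_ft_date_time_cost_retail` and `idx_fss_date_cost_retail` are hypothetical; only the table names, column names, and column order come from the description.

```sql
-- Assumed DDL for the indexes described in the question.
-- Index names are made up for illustration.
CREATE INDEX idx_ft_date_time_cost_retail
    ON fact_transaction (date_key, time_key, unit_cost_price, unit_retail_price);

-- Same layout for the stock-out table, without time_key.
CREATE INDEX idx_fss_date_cost_retail
    ON fact_stockout_sales (date_key, unit_cost_price, unit_retail_price);
```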

Query plan

Query

SELECT 
    t.Level, t.Name, t.KeyValue, 
    ROUND( (SUM(t.Gross)/SUM(t.Revenue))*100, 2 ) AS Value, 
    ROUND( (SUM(t.adjustedGross)/SUM(t.adjustedRevenue))*100, 2 ) AS adjustedValue,
    t.dataType AS dataType 
FROM 
   (SELECT "item" AS Level, ds.product_name AS Name, ds.product_id AS KeyValue, 
        SUM(ft.gross_profit) AS Gross, 
        SUM(ft.selling_amount) AS Revenue, 
        SUM(ft.adjusted_gross_profit) AS adjustedGross, 
        SUM(ft.adjusted_selling_amount) AS adjustedRevenue,
        "%" AS dataType 
    FROM fact_transaction AS ft 
    JOIN dim_sku AS ds ON ft.sku_key = ds.sku_key 
    WHERE ft.date_key BETWEEN 20080215 AND 20130107 
      AND ft.time_key BETWEEN 100 AND 235900 
      AND ft.unit_cost_price BETWEEN 0 AND 1333 
      AND ft.unit_retail_price BETWEEN 0 AND 16500 
      AND ft.store_key IN ("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16") 
      AND ds.product_id IN (1312009,1312007,... Huge List say 30000) 
      AND ds.category IN ("Male","Female","Unisex") 
      AND ft.day_of_week IN ("1","2","3","4","5","6","7") 
      AND ds.collection_name IN ("Base","SS12","AW12") 
    GROUP BY ds.product_id                          
    UNION 
    SELECT "item" AS Level, ds.product_name AS Name, ds.product_id AS KeyValue, 
        SUM(ft.gross_profit) AS Gross, 
        SUM(ft.selling_amount) AS Revenue, 
        SUM(ft.adjusted_gross_profit) AS adjustedGross, 
        SUM(ft.adjusted_selling_amount) AS adjustedRevenue, 
        "%" AS dataType 
    FROM fact_stockout_sales AS ft 
    JOIN dim_sku AS ds ON ft.sku_key = ds.sku_key 
    WHERE ft.date_key BETWEEN 20080215 AND 20130107 
      AND ft.unit_cost_price BETWEEN 0 AND 1333 
      AND ft.unit_retail_price BETWEEN 0 AND 16500 
      AND ft.store_key IN ("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16") 
      AND ds.product_id IN (1312009,1312007,.. Huge List say 30000) 
      AND ds.category IN ("Male","Female","Unisex") 
      AND ft.day_of_week IN ("1","2","3","4","5","6","7") 
      AND ds.collection_name IN ("Base","SS12","AW12") 
      GROUP BY ds.product_id) AS t 
GROUP BY t.KeyValue

1 answer:

Answer 0 (score: 0)

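  1. Measure the run time of the whole query.

  2. Measure the run time of each of the two queries combined by the UNION operator.

  3. Index every column used in a WHERE clause.

  4. Lead with the most selective columns in the WHERE clause, and test the effect of multi-column indexes.

  5. Cut useless tests. (ft.day_of_week can certainly be removed from the WHERE clause, since all seven days are listed.)

  6. Reconsider your data types. Are day of week and store key really strings?

  7. Reconsider the decision to select five years of data at once.

  8. Try moving the product ID numbers into a temporary table and joining against it.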
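
Step 8 could look roughly like the sketch below, shown for the fact_transaction branch only. The table name `tmp_product_ids` and the `INT` type for `product_id` are assumptions, and only the two IDs visible in the question are inserted; the rest of the list would be loaded the same way. The redundant day_of_week filter from step 5 is left out.

```sql
-- Hypothetical staging table replacing the ~30000-element IN (...) list.
-- The INT type is an assumption; match it to dim_sku.product_id.
CREATE TEMPORARY TABLE tmp_product_ids (
    product_id INT NOT NULL,
    PRIMARY KEY (product_id)
);

-- Load the IDs in batches (only the two shown in the question appear here).
INSERT INTO tmp_product_ids (product_id) VALUES (1312009), (1312007);

-- First UNION branch, with the IN list replaced by a join.
SELECT 'item' AS Level, ds.product_name AS Name, ds.product_id AS KeyValue,
       SUM(ft.gross_profit) AS Gross,
       SUM(ft.selling_amount) AS Revenue,
       SUM(ft.adjusted_gross_profit) AS adjustedGross,
       SUM(ft.adjusted_selling_amount) AS adjustedRevenue,
       '%' AS dataType
FROM fact_transaction AS ft
JOIN dim_sku AS ds ON ft.sku_key = ds.sku_key
JOIN tmp_product_ids AS p ON p.product_id = ds.product_id
WHERE ft.date_key BETWEEN 20080215 AND 20130107
  AND ft.time_key BETWEEN 100 AND 235900
  AND ft.unit_cost_price BETWEEN 0 AND 1333
  AND ft.unit_retail_price BETWEEN 0 AND 16500
  AND ft.store_key IN ('1', '2', '3', '4', '5', '6', '7', '8',
                       '9', '10', '11', '12', '13', '14', '15', '16')
  AND ds.category IN ('Male', 'Female', 'Unisex')
  AND ds.collection_name IN ('Base', 'SS12', 'AW12')
GROUP BY ds.product_id;

DROP TEMPORARY TABLE tmp_product_ids;
```

One caveat: MySQL 5.5 does not allow a TEMPORARY table to be referenced more than once in the same query, so using it in both UNION branches at once would need either a regular (non-temporary) table or one temporary table per branch. Running the branches separately, as in step 2, sidesteps the issue while measuring. Prefixing the SELECT with `EXPLAIN` shows whether the join actually changes the plan.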