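How can I make a query with a dependent subquery in SQL perform better?

Date: 2014-10-15 10:09:49

Tags: mysql sql

I'm trying to run a query across two large tables. I need to join them while keeping only the row with the minimum date, and that date filter seems to slow the query down a lot. The filter is required, though, so is there any way to speed it up? As it stands, the query just keeps loading and loading.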

Here is what EXPLAIN gives me:

[image: EXPLAIN output]

The query is:

SELECT T1.id_no, 
       T1.condition_code, 
       Count(T1.condition_code) AS COUNT, 
       T1.doe, 
       T2.id_no, 
       T2.trans_time, 
       T2.from_routing_pos 
FROM   attrcoll_month T1 
       JOIN live_trans T2 
         ON T1.id_no = T2.id_no 
WHERE  T2.trans_time = (SELECT Min(trans_time) 
                        FROM   live_trans T2_MIN 
                        WHERE  T2_MIN.id_no = T2.id_no) 
       AND T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59' 
       AND T1.unique_code = 'XXY' 
GROUP  BY T2.from_routing_pos, 
          T1.condition_code 

A snippet of the data in each table:

ATTRCOLL_MONTH T1

ID_NO   DOE                CONDITION_CODE   UNIQUE_CODE
8442    25/09/2014 22:49    NEND             XXY
8442    25/09/2014 22:49    SEND             XXY
8442    25/09/2014 22:49    BS               XXY
8442    25/09/2014 22:49    BS               XXY
8442    25/09/2014 22:49    BS               XXY
8442    25/09/2014 22:49    TD               XXY
8511    25/09/2014 22:49    NEND             XXY
8511    25/09/2014 22:49    SEND             XXY
8511    25/09/2014 22:49    BS               XXY
8511    25/09/2014 22:49    BS               XXY
8511    25/09/2014 22:49    BS               XXY
8511    25/09/2014 22:49    TD               XXY
8511    24/09/2014 12:49    OF               XXY
8511    24/09/2014 12:49    OF               XXY
8675    24/09/2014 12:49    NEND             XXY
8675    24/09/2014 12:49    SEND             XXY
9081    24/09/2014 12:49    NEND             XXY

LIVE_TRANS T2

ID_NO   TRANS_TIME  UNIQUE_CODE FROM_ROUTING_POS
8442    2.12276E+17 XXY          OD1
8442    2.12276E+17 XXY          OD2
8445    2.12276E+17 XXY          OD3
8214    2.12276E+17 XXY          OD2
8325    2.12276E+17 XXY          OD1
842     2.12276E+17 XXY          OD3
2444    2.12276E+17 XXY          OD3

Sorry about the formatting of the table data!

I hope this explains it well enough; let me know if you need more information.

4 answers:

Answer 0: (score: 2)

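  1. First fetch the records from T1 into a temporary table.
  2. Then join the temporary table with T2 (aliased T2_MIN) to get the minimum time for each id.
  3. Then join #1, #2, and T2 together and apply the GROUP BY. This should give some performance improvement.
  4. The basic idea is to limit the records that take part in the join and to remove the subquery.

    Here is a sample: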

    -- Fetch the rows from table one that pass all the filter conditions.
    -- This reduces the rows read when the joins are applied.
    -- (MySQL temporary tables; the names tmp_attr and tmp_min_time are arbitrary.)
    CREATE TEMPORARY TABLE tmp_attr AS
    SELECT
        T1.id_no,
        T1.condition_code,
        T1.doe
    FROM
        attrcoll_month T1
    WHERE
        T1.doe >= '2014-09-01'
        AND T1.doe < '2014-09-03'
        AND T1.unique_code = 'XXY';

    -- Get the minimum time for only the required ids. This avoids the
    -- subquery, and reads are reduced because tmp_attr holds few rows.
    CREATE TEMPORARY TABLE tmp_min_time AS
    SELECT
        T.id_no,
        MIN(T2_MIN.trans_time) AS MinTime
    FROM
        tmp_attr T
        JOIN live_trans T2_MIN ON T.id_no = T2_MIN.id_no
    GROUP BY
        T.id_no;

    -- Merge #1 and #2 with live_trans and apply the GROUP BY.
    SELECT
        T1.id_no,
        T1.condition_code,
        COUNT(T1.condition_code) AS count,
        T1.doe,
        T2.id_no,
        T2.trans_time,
        T2.from_routing_pos
    FROM
        tmp_attr T1
        JOIN tmp_min_time T ON T1.id_no = T.id_no
        JOIN live_trans T2 ON T.id_no = T2.id_no
    WHERE
        T2.trans_time = T.MinTime
    GROUP BY
        T2.from_routing_pos,
        T1.condition_code;
    

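Answer 1: (score: 1)

You are running a correlated subquery, which means that for every record the outer query produces, it runs another query against live_trans. You may want to swap that for a subquery that gets all the IDs and their minimum dates once, and then join back to the tables to get the rest of the details.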

select
      FT1.id_no, 
      FT1.condition_code, 
      Count(*) AS ConditionCount, 
      FT1.doe, 
      FT2.id_no, 
      FT2.trans_time, 
      FT2.from_routing_pos
   from
      ( select
              t1.id_no,
              min( t2.trans_time ) as MinTime
           from
              attrcoll_month t1
                 JOIN live_trans T2 
                    on t1.id_no = t2.id_no
           where
                  T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59' 
              AND T1.unique_code = 'XXY' 
           group by
              t1.id_no ) as PreQuery
      JOIN attrcoll_month FT1
         on PreQuery.ID_No = FT1.ID_No
        AND FT1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59'
        AND FT1.unique_code = 'XXY'
      JOIN live_trans FT2 
         ON PreQuery.id_no = FT2.id_no 
        AND PreQuery.MinTime = FT2.trans_time
   group by
      FT2.from_routing_pos, 
      FT1.condition_code 

To help the query, I would have the following indexes on the tables:

attrcoll_month index = (unique_code, doe, id_no )
attrcoll_month additional index for secondary join = ( id_no, condition_code )

live_trans index = ( id_no, trans_time )

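This way, the "PreQuery" picks up only the IDs that qualify on date/time and computes the minimum time ONCE per ID. Then, since you have the IDs, you simply join back to get the remaining details.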
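
The three indexes listed in this answer could be created roughly as follows. This is only a sketch: the index names (idx_...) are made up for illustration, and the column lists are the ones given above.

```sql
-- Hypothetical index names; column order follows the answer's suggestion.
-- Supports the WHERE filter on unique_code and doe, and carries id_no for the join.
CREATE INDEX idx_attrcoll_code_doe_id ON attrcoll_month (unique_code, doe, id_no);

-- Supports joining back to attrcoll_month by id_no.
CREATE INDEX idx_attrcoll_id_cond ON attrcoll_month (id_no, condition_code);

-- Supports looking up the minimum trans_time per id_no.
CREATE INDEX idx_live_trans_id_time ON live_trans (id_no, trans_time);
```

After creating them, running the query under EXPLAIN again and checking the key column should show whether the optimizer actually picks these indexes.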

Answer 2: (score: 1)

Finding the min() or max() tuple can be expressed as NOT EXISTS (a lower / higher one):

SELECT T1.id_no 
       , T1.condition_code
       , Count(T1.condition_code) AS COUNT 
       , T1.doe 
       , T2.id_no 
       , T2.trans_time 
       , T2.from_routing_pos
FROM   attrcoll_month T1 
  JOIN live_trans T2      
    ON T1.id_no = T2.id_no      
WHERE  NOT EXISTS (
          SELECT *
          FROM   live_trans T2_MIN               
          WHERE  T2_MIN.id_no = T2.id_no
            AND T2_MIN.trans_time < T2.trans_time
          )    
       AND T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59' 
       AND T1.unique_code = 'XXY' 
GROUP  BY T2.from_routing_pos, T1.condition_code
        ;
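
Note that the NOT EXISTS form is still a correlated subquery, so whether it helps probably depends on the MySQL version and on having an index that covers (id_no, trans_time) on live_trans. A minimal sketch for checking the plan of just the "earliest row per id_no" part, using only the tables and columns from the question:

```sql
-- Show the plan for the anti-join part on its own.
EXPLAIN
SELECT T2.id_no,
       T2.trans_time
FROM   live_trans T2
WHERE  NOT EXISTS (
          SELECT *
          FROM   live_trans T2_MIN
          WHERE  T2_MIN.id_no = T2.id_no
            AND  T2_MIN.trans_time < T2.trans_time
       );
```

In the output, the select_type and key columns for T2_MIN are the ones to look at: a DEPENDENT SUBQUERY row with no key would suggest each probe scans the table.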

Answer 3: (score: 0)

Could you tell us how many rows this query returns:

SELECT T2.*
  FROM attrcoll_month T1 
  JOIN live_trans T2 ON (T1.id_no = T2.id_no)
 WHERE T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59' 
   AND T1.unique_code = 'XXY' 

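If the query above yields only a few rows, select them into a temporary table first. If that doesn't make much difference, maybe you should redesign your tables.

Some code that may perform a bit better: (updated)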

SELECT T1.id_no, 
       T1.condition_code, 
       Count(T1.condition_code) AS COUNT, 
       T1.doe, 
       T2.id_no, 
       T2.trans_time, 
       T2.from_routing_pos 
FROM   attrcoll_month T1 
JOIN   (SELECT live_trans.id_no, min(live_trans.trans_time) min_trans_time
          FROM live_trans
          JOIN attrcoll_month ON (live_trans.id_no = attrcoll_month.id_no)
         WHERE attrcoll_month.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59' 
           AND attrcoll_month.unique_code = 'XXY' 
         GROUP BY live_trans.id_no) T2_MIN ON (T1.id_no = T2_MIN.id_no)
JOIN live_trans T2 
         ON (T2.id_no = T2_MIN.id_no AND T2.trans_time = T2_MIN.min_trans_time)
WHERE  T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59' 
  AND T1.unique_code = 'XXY' 
GROUP  BY T2.from_routing_pos, 
          T1.condition_code

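If the first query returns far fewer rows, use a temp table instead of the huge T2. On the other hand, if (t2.id_no, t2.trans_time) has duplicate records, the result may be incorrect, because the condition joining t2 to t2_min can produce multiple rows. To confirm, try this query: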

SELECT t2.id_no, t2.trans_time, count(*) AS my_count
  FROM (
       SELECT t.id_no, min(t.trans_time) as min_trans_time
         FROM live_trans t
        GROUP BY t.id_no) t1
  JOIN live_trans t2 ON (t2.id_no = t1.id_no
                     AND t2.trans_time = t1.min_trans_time)
 GROUP BY t2.id_no, t2.trans_time
HAVING my_count > 1

If that query returns rows, use the NOT EXISTS statement from the other answer, replacing T2 with the first query in my answer as a subquery.