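I am trying to run a query against two large tables. I need to join them while keeping only the row with the minimum date, and that minimum-date filter seems to slow the query down a lot. The filter is required, though. Is there any way to speed it up? As it stands, the query just keeps loading and never finishes.
Here is what I get from EXPLAIN:
The query is: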
SELECT T1.id_no,
T1.condition_code,
Count(T1.condition_code) AS COUNT,
T1.doe,
T2.id_no,
T2.trans_time,
T2.from_routing_pos
FROM attrcoll_month T1
JOIN live_trans T2
ON T1.id_no = T2.id_no
WHERE T2.trans_time = (SELECT Min(trans_time)
FROM live_trans T2_MIN
WHERE T2_MIN.id_no = T2.id_no)
AND T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59'
AND T1.unique_code = 'XXY'
GROUP BY T2.from_routing_pos,
T1.condition_code
A sample of the data in each table:
ATTRCOLL_MONTH T1
ID_NO  DOE               CONDITION_CODE  UNIQUE_CODE
8442   25/09/2014 22:49  NEND            XXY
8442   25/09/2014 22:49  SEND            XXY
8442   25/09/2014 22:49  BS              XXY
8442   25/09/2014 22:49  BS              XXY
8442   25/09/2014 22:49  BS              XXY
8442   25/09/2014 22:49  TD              XXY
8511   25/09/2014 22:49  NEND            XXY
8511   25/09/2014 22:49  SEND            XXY
8511   25/09/2014 22:49  BS              XXY
8511   25/09/2014 22:49  BS              XXY
8511   25/09/2014 22:49  BS              XXY
8511   25/09/2014 22:49  TD              XXY
8511   24/09/2014 12:49  OF              XXY
8511   24/09/2014 12:49  OF              XXY
8675   24/09/2014 12:49  NEND            XXY
8675   24/09/2014 12:49  SEND            XXY
9081   24/09/2014 12:49  NEND            XXY
LIVE_TRANS T2
ID_NO  TRANS_TIME   UNIQUE_CODE  FROM_ROUTING_POS
8442   2.12276E+17  XXY          OD1
8442   2.12276E+17  XXY          OD2
8445   2.12276E+17  XXY          OD3
8214   2.12276E+17  XXY          OD2
8325   2.12276E+17  XXY          OD1
842    2.12276E+17  XXY          OD3
2444   2.12276E+17  XXY          OD3
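Sorry about the formatting of the table data!
I hope this explains it well enough; let me know if you need more information.
Answer 0 (score: 2)
The basic idea is to limit the records that take part in the join and to remove the subquery.
Here is a sample: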
--Fetch records from Table one based on all filtering conditions
-- this will reduce the logical read when we apply join
SELECT
T1.id_no,
T1.condition_code,
T1.doe
INTO
#Temp
FROM
attrcoll_month T1
WHERE
T1.doe >= '2014-09-01'
AND T1.doe < '2014-09-03'
AND T1.unique_code = 'XXY';
-- Get the min time for only the required ids. This avoids the subquery and
-- also reduces reads, since the records in #Temp are limited
SELECT
MIN(trans_time) MinTime,
T.id_no
INTO
#tempMinTime
FROM
#Temp T
JOIN live_trans T2_MIN ON T.id_no = T2_MIN.id_no
GROUP BY
T.id_no;
--Merging #1 and #2
SELECT
T1.id_no,
T1.condition_code,
COUNT(T1.condition_code) AS count,
T1.doe,
T2.id_no,
T2.trans_time,
T2.from_routing_pos
FROM
#Temp T1
JOIN #tempMinTime T ON T1.id_no = T.id_no
JOIN live_trans T2 ON T.id_no = T2.id_no
WHERE
T2.trans_time = T.MinTime
GROUP BY
T2.from_routing_pos,
T1.condition_code;
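Answer 1 (score: 1)
You are running a correlated subquery, which means that for every record in the main table (t1) it runs a query against t2. You may want to swap it for a subquery that gets all the IDs and their minimum dates once, and then join back to the t1 table for the rest of the details.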
select
FT1.id_no,
FT1.condition_code,
Count(*) AS ConditionCount,
FT1.doe,
FT2.id_no,
FT2.trans_time,
FT2.from_routing_pos
from
( select
t1.id_no,
min( t2.trans_time ) as MinTime
from
attrcoll_month t1
JOIN live_trans T2
on t1.id_no = t2.id_no
where
T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59'
AND T1.unique_code = 'XXY'
group by
t1.id_no ) as PreQuery
JOIN attrcoll_month FT1
on PreQuery.ID_No = FT1.ID_No
JOIN live_trans FT2
ON PreQuery.id_no = FT2.id_no
AND PreQuery.MinTime = FT2.trans_time
group by
FT2.from_routing_pos,
FT1.condition_code
To help the query, I would have the following indexes on the tables:
attrcoll_month index = (unique_code, doe, id_no )
attrcoll_month additional index for secondary join = ( id_no, condition_code )
live_trans index = ( id_no, trans_time )
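This way, the "PreQuery" only gets the IDs that qualify for the date/time range and computes the minimum date ONCE. Then, since you have the IDs, you just join back to get the rest of the details.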
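For reference, the index suggestions above could be written out roughly as follows. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database so it can be run as-is, the column types are guessed from the sample data, and the index names (ix_...) are made up for illustration. On your actual DBMS the CREATE INDEX statements should look much the same, but the plan output will differ.

```python
import sqlite3

# In-memory stand-in for the real database; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attrcoll_month (
        id_no INTEGER, doe TEXT, condition_code TEXT, unique_code TEXT);
    CREATE TABLE live_trans (
        id_no INTEGER, trans_time INTEGER, unique_code TEXT, from_routing_pos TEXT);

    -- Index names are hypothetical; the column lists follow the answer.
    CREATE INDEX ix_attrcoll_code_doe_id ON attrcoll_month (unique_code, doe, id_no);
    CREATE INDEX ix_attrcoll_id_cond     ON attrcoll_month (id_no, condition_code);
    CREATE INDEX ix_live_trans_id_time   ON live_trans (id_no, trans_time);
""")

index_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)

# Ask SQLite how it would execute the "PreQuery" part of the statement.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT t1.id_no, MIN(t2.trans_time) AS MinTime
    FROM attrcoll_month t1
    JOIN live_trans t2 ON t1.id_no = t2.id_no
    WHERE t1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59'
      AND t1.unique_code = 'XXY'
    GROUP BY t1.id_no
""").fetchall()
for step in plan:
    print(step)
```

Whatever database you are on, it is worth looking at the plan before and after adding the indexes to confirm they are actually picked up.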
Answer 2 (score: 1)
Finding the min() or max() tuple can also be expressed with NOT EXISTS (a row with a lower / higher value):
SELECT T1.id_no
, T1.condition_code
, Count(T1.condition_code) AS COUNT
, T1.doe
, T2.id_no
, T2.trans_time
, T2.from_routing_pos
FROM attrcoll_month T1
JOIN live_trans T2
ON T1.id_no = T2.id_no
WHERE NOT EXISTS (
SELECT *
FROM live_trans T2_MIN
WHERE T2_MIN.id_no = T2.id_no
AND T2_MIN.trans_time < T2.trans_time
)
AND T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59'
AND T1.unique_code = 'XXY'
GROUP BY T2.from_routing_pos, T1.condition_code
;
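If you want to convince yourself that the NOT EXISTS form picks the same rows as the MIN() subquery, here is a tiny runnable check. It is a sketch using Python's sqlite3 module with made-up rows (the trans_time values are invented integers, not your real data), and it only exercises the live_trans part of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE live_trans (id_no INTEGER, trans_time INTEGER, from_routing_pos TEXT)"
)
# Invented sample rows: two ids, each with an earlier and a later transaction.
conn.executemany(
    "INSERT INTO live_trans VALUES (?, ?, ?)",
    [(8442, 100, "OD1"), (8442, 200, "OD2"), (8445, 300, "OD3"), (8445, 250, "OD1")],
)

# Original formulation: correlated MIN() subquery.
rows_min = conn.execute("""
    SELECT T2.id_no, T2.trans_time, T2.from_routing_pos
    FROM live_trans T2
    WHERE T2.trans_time = (SELECT MIN(trans_time)
                           FROM live_trans T2_MIN
                           WHERE T2_MIN.id_no = T2.id_no)
    ORDER BY T2.id_no
""").fetchall()

# Alternative: keep a row only if no earlier row exists for the same id.
rows_not_exists = conn.execute("""
    SELECT T2.id_no, T2.trans_time, T2.from_routing_pos
    FROM live_trans T2
    WHERE NOT EXISTS (SELECT *
                      FROM live_trans T2_MIN
                      WHERE T2_MIN.id_no = T2.id_no
                        AND T2_MIN.trans_time < T2.trans_time)
    ORDER BY T2.id_no
""").fetchall()

print(rows_min)
print(rows_min == rows_not_exists)
```

Whether NOT EXISTS is actually faster depends on your optimizer and on having an index on (id_no, trans_time), so compare the plans rather than assuming.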
Answer 3 (score: 0)
Could you tell us how many rows this query returns:
SELECT T2.*
FROM attrcoll_month T1
JOIN live_trans T2 ON (T1.id_no = T2.id_no)
WHERE T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59'
AND T1.unique_code = 'XXY'
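If the query above yields only a few rows, select them into a temporary table first. If that does not make much difference, perhaps you should redesign your tables.
This code may work somewhat better (updated):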
SELECT T1.id_no,
T1.condition_code,
Count(T1.condition_code) AS COUNT,
T1.doe,
T2.id_no,
T2.trans_time,
T2.from_routing_pos
FROM attrcoll_month T1
JOIN (SELECT live_trans.id_no, min(live_trans.trans_time) min_trans_time
FROM live_trans
JOIN attrcoll_month ON (attrcoll_month.id_no = live_trans.id_no)
WHERE attrcoll_month.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59'
AND attrcoll_month.unique_code = 'XXY'
GROUP BY live_trans.id_no) T2_MIN ON (T1.id_no = T2_MIN.id_no)
JOIN live_trans T2
ON (T2.id_no = T2_MIN.id_no AND T2.trans_time = T2_MIN.min_trans_time)
WHERE T1.doe BETWEEN '2014-09-01 00:00:01' AND '2014-09-02 23:59:59'
AND T1.unique_code = 'XXY'
GROUP BY T2.from_routing_pos,
T1.condition_code
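If the first query returns far fewer rows, use the temp table instead of the huge T2. On the other hand, if (t2.id_no, t2.trans_time) contains duplicate records, the result may be wrong, because the join between t2 and t2_min can then produce multiple rows. To check, try this query: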
SELECT t2.id_no, t2.trans_time, count(*) AS my_count
FROM (
SELECT t.id_no, min(t.trans_time) as min_trans_time
FROM live_trans t
GROUP BY t.id_no) t1
JOIN live_trans t2 ON (t2.id_no = t1.id_no AND t2.trans_time = t1.min_trans_time)
GROUP BY t2.id_no, t2.trans_time
HAVING count(*) > 1
If this query returns rows, use the NOT EXISTS statement from the other answer, replacing T2 with the first query in my answer as a subquery.
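Here is a small runnable illustration of that duplicate problem, again only as a sketch with Python's sqlite3 module and invented rows. In this made-up data, id_no 1 has two rows sharing the same minimum trans_time, so joining on (id_no, min trans_time) would return both of them, and the check reports that id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE live_trans (id_no INTEGER, trans_time INTEGER, from_routing_pos TEXT)"
)
# Invented rows: id 1 has a tie on its earliest trans_time, id 2 does not.
conn.executemany(
    "INSERT INTO live_trans VALUES (?, ?, ?)",
    [(1, 100, "OD1"), (1, 100, "OD2"), (1, 300, "OD3"), (2, 50, "OD3"), (2, 70, "OD1")],
)

# For each id, count how many rows sit exactly at the minimum trans_time.
dupes = conn.execute("""
    SELECT t2.id_no, t2.trans_time, COUNT(*) AS my_count
    FROM (SELECT id_no, MIN(trans_time) AS min_trans_time
          FROM live_trans
          GROUP BY id_no) t1
    JOIN live_trans t2
      ON t2.id_no = t1.id_no AND t2.trans_time = t1.min_trans_time
    GROUP BY t2.id_no, t2.trans_time
    HAVING COUNT(*) > 1
""").fetchall()

print(dupes)
```

An empty result means every id has a single earliest row, and the join on the minimum time is safe.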