Optimizing a SQL query that uses the DISTINCT keyword and functions

Time: 2018-03-26 14:19:52

Tags: sql max distinct

I have the query below. It returns about 40,000 records and takes about 1 minute 30 seconds to execute.

SELECT DISTINCT
a.ID,
a.NAME,
a.DIV,
a.UID,
(select NAME from EMPLOYEE where UID= a.UID and UID<>'') as boss_id, 
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 1 and id = a.ID) as TERM1,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 2 and id = a.ID) as TERM2,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 3 and id = a.ID) as TERM3,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 4 and id = a.ID) as TERM4,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 5 and id = a.ID) as TERM5,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 6 and id = a.ID) as TERM6,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 7 and id = a.ID) as TERM7,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 8 and id = a.ID) as TERM8
FROM EMPLOYEE a
WHERE ID LIKE 'D%'

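I tried GROUP BY and different kinds of joins to improve the execution time, but without success. The tables ABC and XYZ are both indexed. I also suspect that the root cause is either the DISTINCT keyword or the MAX function. How can I optimize the query above so that it runs in under a minute at least?

Any help is appreciated.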

2 answers:

Answer 0: (score: 1)

The queries are untested; this is just an idea of how the task could be done in two different ways.

SQL Server solutions here

  1. Using a LEFT JOIN for each XYZ_ID should look like this:

    SELECT a.ID,
           a.NAME,
           a.DIV,
           a.UID,
           b.NAME AS boss_id,
           MAX(xyz1.create_time) AS TERM1,
           MAX(xyz2.create_time) AS TERM2,
           MAX(xyz3.create_time) AS TERM3,
           MAX(xyz4.create_time) AS TERM4,
           MAX(xyz5.create_time) AS TERM5,
           MAX(xyz6.create_time) AS TERM6,
           MAX(xyz7.create_time) AS TERM7,
           MAX(xyz8.create_time) AS TERM8
    FROM EMPLOYEE a
        -- LEFT JOIN keeps employees without a match, as the scalar subquery in the question does
        LEFT JOIN EMPLOYEE b ON a.UID = b.UID AND b.UID <> ''
        -- each XYZ_ID filter has to reference its own alias
        LEFT JOIN XYZ xyz1 ON a.ID = xyz1.ID AND xyz1.XYZ_ID = 1
        LEFT JOIN XYZ xyz2 ON a.ID = xyz2.ID AND xyz2.XYZ_ID = 2
        LEFT JOIN XYZ xyz3 ON a.ID = xyz3.ID AND xyz3.XYZ_ID = 3
        LEFT JOIN XYZ xyz4 ON a.ID = xyz4.ID AND xyz4.XYZ_ID = 4
        LEFT JOIN XYZ xyz5 ON a.ID = xyz5.ID AND xyz5.XYZ_ID = 5
        LEFT JOIN XYZ xyz6 ON a.ID = xyz6.ID AND xyz6.XYZ_ID = 6
        LEFT JOIN XYZ xyz7 ON a.ID = xyz7.ID AND xyz7.XYZ_ID = 7
        LEFT JOIN XYZ xyz8 ON a.ID = xyz8.ID AND xyz8.XYZ_ID = 8
    WHERE a.ID LIKE 'D%'
    GROUP BY a.ID, a.NAME, a.DIV, a.UID, b.NAME
    
  2. Using PIVOT would look like this:

    SELECT * FROM (
        SELECT DISTINCT
               a.ID,
               a.NAME,
               a.DIV,
               a.UID,
               b.NAME AS boss_id,
               xyz.XYZ_ID,
               xyz.create_time
        FROM EMPLOYEE a
            LEFT JOIN EMPLOYEE b ON a.UID = b.UID AND b.UID <> ''
            -- latest create_time per (XYZ_ID, ID); CAST replaces DATE(), which SQL Server lacks
            LEFT JOIN (SELECT CAST(MAX(create_time) AS date) AS create_time, XYZ_ID, ID
                       FROM XYZ
                       WHERE XYZ_ID BETWEEN 1 AND 8
                       GROUP BY XYZ_ID, ID) xyz ON a.ID = xyz.ID
        WHERE a.ID LIKE 'D%') src
    PIVOT (
        -- the pivot column names are the bare XYZ_ID values
        MAX(create_time) FOR XYZ_ID IN ([1], [2], [3], [4],
                                        [5], [6], [7], [8])
    ) PIV
    
  3. Give it a try.
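
One caveat with the per-XYZ_ID joins in option 1: when an employee has several XYZ rows for the same XYZ_ID, the eight joins multiply those rows before MAX collapses them again. A possible variant, equally untested, reduces XYZ to one row per (ID, XYZ_ID) first, so each join matches at most one row and the outer GROUP BY is no longer needed. The CTE name `latest` and the aliases `t1` to `t8` are made up for this sketch, `CAST(... AS date)` stands in for the `DATE()` call in the question, and the self-join assumes a UID matches at most one EMPLOYEE row (which the scalar subquery in the question already requires):

```sql
-- Sketch only: "latest" and t1..t8 are illustrative names, not from the question.
WITH latest AS (
    -- one row per (ID, XYZ_ID) holding the most recent create_time as a date
    SELECT ID, XYZ_ID, CAST(MAX(create_time) AS date) AS create_time
    FROM XYZ
    WHERE XYZ_ID BETWEEN 1 AND 8
    GROUP BY ID, XYZ_ID
)
SELECT a.ID, a.NAME, a.DIV, a.UID,
       b.NAME AS boss_id,
       t1.create_time AS TERM1,
       t2.create_time AS TERM2,
       t3.create_time AS TERM3,
       t4.create_time AS TERM4,
       t5.create_time AS TERM5,
       t6.create_time AS TERM6,
       t7.create_time AS TERM7,
       t8.create_time AS TERM8
FROM EMPLOYEE a
    -- assumes at most one EMPLOYEE row per non-empty UID
    LEFT JOIN EMPLOYEE b ON a.UID = b.UID AND b.UID <> ''
    -- each join can match at most one pre-aggregated row
    LEFT JOIN latest t1 ON t1.ID = a.ID AND t1.XYZ_ID = 1
    LEFT JOIN latest t2 ON t2.ID = a.ID AND t2.XYZ_ID = 2
    LEFT JOIN latest t3 ON t3.ID = a.ID AND t3.XYZ_ID = 3
    LEFT JOIN latest t4 ON t4.ID = a.ID AND t4.XYZ_ID = 4
    LEFT JOIN latest t5 ON t5.ID = a.ID AND t5.XYZ_ID = 5
    LEFT JOIN latest t6 ON t6.ID = a.ID AND t6.XYZ_ID = 6
    LEFT JOIN latest t7 ON t7.ID = a.ID AND t7.XYZ_ID = 7
    LEFT JOIN latest t8 ON t8.ID = a.ID AND t8.XYZ_ID = 8
WHERE a.ID LIKE 'D%';
```

Whether this beats the other two forms depends on the data and the optimizer, so compare the execution plans before settling on one.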

Answer 1: (score: 1)

I would suggest GROUP BY and conditional aggregation:

SELECT e.ID, e.NAME, e.DIV, e.UID,
       DATE(MAX(CASE WHEN x.XYZ_ID = 1 THEN x.create_time END)) AS term1,
       DATE(MAX(CASE WHEN x.XYZ_ID = 2 THEN x.create_time END)) AS term2,
       DATE(MAX(CASE WHEN x.XYZ_ID = 3 THEN x.create_time END)) AS term3,
       DATE(MAX(CASE WHEN x.XYZ_ID = 4 THEN x.create_time END)) AS term4,
       DATE(MAX(CASE WHEN x.XYZ_ID = 5 THEN x.create_time END)) AS term5,
       DATE(MAX(CASE WHEN x.XYZ_ID = 6 THEN x.create_time END)) AS term6,
       DATE(MAX(CASE WHEN x.XYZ_ID = 7 THEN x.create_time END)) AS term7,
       DATE(MAX(CASE WHEN x.XYZ_ID = 8 THEN x.create_time END)) AS term8
FROM EMPLOYEE e LEFT JOIN
     XYZ x
     ON x.ID = e.ID
WHERE e.ID LIKE 'D%'
GROUP BY e.ID, e.NAME, e.DIV, e.UID;

I don't understand the logic behind boss_id, so I left it out. This should improve performance significantly.
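
If boss_id is needed, one way to put it back is to keep the question's scalar subquery unchanged and move the conditional aggregation into a derived table over XYZ alone, so the GROUP BY never touches the EMPLOYEE columns. This is an untested sketch: the alias `t` is made up here, `DATE()` is assumed to be available as in the question, and the extra `LIKE 'D%'` filter inside the derived table relies on XYZ.ID holding the same values as EMPLOYEE.ID, as the join condition implies.

```sql
-- Sketch only: the derived-table alias "t" is illustrative.
SELECT e.ID, e.NAME, e.DIV, e.UID,
       -- boss_id copied as-is from the question's query
       (SELECT NAME FROM EMPLOYEE WHERE UID = e.UID AND UID <> '') AS boss_id,
       t.term1, t.term2, t.term3, t.term4,
       t.term5, t.term6, t.term7, t.term8
FROM EMPLOYEE e
LEFT JOIN (
    -- one row per ID: the latest create_time for each XYZ_ID, as a date
    SELECT ID,
           DATE(MAX(CASE WHEN XYZ_ID = 1 THEN create_time END)) AS term1,
           DATE(MAX(CASE WHEN XYZ_ID = 2 THEN create_time END)) AS term2,
           DATE(MAX(CASE WHEN XYZ_ID = 3 THEN create_time END)) AS term3,
           DATE(MAX(CASE WHEN XYZ_ID = 4 THEN create_time END)) AS term4,
           DATE(MAX(CASE WHEN XYZ_ID = 5 THEN create_time END)) AS term5,
           DATE(MAX(CASE WHEN XYZ_ID = 6 THEN create_time END)) AS term6,
           DATE(MAX(CASE WHEN XYZ_ID = 7 THEN create_time END)) AS term7,
           DATE(MAX(CASE WHEN XYZ_ID = 8 THEN create_time END)) AS term8
    FROM XYZ
    -- skip rows that can never join to a 'D%' employee or feed a term column
    WHERE ID LIKE 'D%' AND XYZ_ID BETWEEN 1 AND 8
    GROUP BY ID
) t ON t.ID = e.ID
WHERE e.ID LIKE 'D%';
```

Either way, MAX(create_time) per (ID, XYZ_ID) tends to be cheapest when XYZ has a composite index leading with those columns, for example on (ID, XYZ_ID, create_time). The question only says the tables are indexed, so it is worth checking the execution plan.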