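MySQL - JOIN condition & performance issue

Time: 2016-02-18 08:25:01

Tags: mysql sql

In this fiddle, I added data filter conditions on task_date, employee_name and region in the JOIN clause. It seems to me that I am doing something wrong here, because I don't see filtered data in the output.

I expect to see only data between '2015-01-03' and '2015-01-06', whereas the current query returns data as if the date filter did not exist.

Also, is there a better way to write this query? The fiddle is only there to get the SQL right, while the production environment holds years' worth of data. There the query never seems to finish (it can run for more than 30 minutes and keeps running until it is killed). In case anyone is wondering whether tuning the database is an option: the database is already optimized for performance.

Any pointers here would be very useful.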

Input/Output

[Image: input and expected output]

CODE

CREATE TABLE ForgeRock
    (`task_date` date, `employee_name` varchar(7), `task_name` varchar(55), `region` varchar(100));
INSERT INTO ForgeRock
    (`task_date`, `employee_name`, `task_name`, `region`)
VALUES
    ('2015-01-01', 'A', 'task A','USA'),
    ('2015-01-01', 'B', 'task B','Russia'),
    ('2015-01-01', 'C', 'task C','USA'),
    ('2015-01-01', 'D', 'task D','USA'),
    ('2015-01-02', 'A', 'task A','Russia'),
    ('2015-01-02', 'B', 'task B','Singapore'),
    ('2015-01-02', 'C', 'task C','USA'),
    ('2015-01-02', 'D', 'task D','USA'),
    ('2015-01-03', 'A', 'task C','Australia'),
    ('2015-01-03', 'B', 'task B','London'),
    ('2015-01-03', 'C', 'task D','USA'),
    ('2015-01-03', 'D', 'task A','USA'),
    ('2015-01-03', 'C', 'task C','London'),
    ('2015-01-04', 'A', 'task B','USA'),
    ('2015-01-04', 'B', 'task A','Singapore'),
    ('2015-01-04', 'C', 'task C','USA'),
    ('2015-01-04', 'D', 'task D','India'),
    ('2015-01-05', 'A', 'task F','USA'),
    ('2015-01-05', 'B', 'task F','USA'),
    ('2015-01-05', 'C', 'task G','China'),
    ('2015-01-05', 'D', 'task B','USA'),
    ('2015-01-06', 'A', 'task Y','USA'),
    ('2015-01-06', 'B', 'task X','USA'),
    ('2015-01-06', 'C', 'task E','USA'),
    ('2015-01-06', 'D', 'task R','USA'),
    ('2015-01-07', 'A', 'task W','China'),
    ('2015-01-07', 'B', 'task O','Russia'),
    ('2015-01-07', 'C', 'task P','USA'),
    ('2015-01-07', 'D', 'task S','London'),
    ('2015-01-07', 'C', 'task E','USA'),
    ('2015-01-08', 'A', 'task E','USA'),
    ('2015-01-08', 'B', 'task W','USA'),
    ('2015-01-08', 'C', 'task C','USA'),
    ('2015-01-08', 'D', 'task B','London');

SQL QUERY

SELECT   task_date, 
         employee_name, 
         Group_concat(task_name) 
FROM     ( 
             SELECT DISTINCT a.task_date, 
                             a.employee_name, 
                             CASE 
                                 WHEN b.employee_name IS NOT NULL
                                     AND c.employee_name IS NULL THEN NULL
                                 ELSE a.task_name
                             END AS task_name 
             FROM forgerock AS a 
             LEFT OUTER JOIN forgerock AS b 
                 ON  a.employee_name = b.employee_name = 'A'
                 AND a.task_date >= '2015-01-03' 
                 AND a.task_date <= '2015-01-06' 
                 AND b.task_date >= '2015-01-03' 
                 AND b.task_date <= '2015-01-06' 
                 AND a.task_date - 1 = b.task_date
                 AND a.region = b.region = 'USA' 
             LEFT OUTER JOIN forgerock AS c 
                 ON  a.employee_name = c.employee_name = 'A'
                 AND a.task_date >= '2015-01-03' 
                 AND a.task_date <= '2015-01-06' 
                 AND c.task_date >= '2015-01-03' 
                 AND c.task_date <= '2015-01-06' 
                 AND a.task_date - 1 = c.task_date
                 AND a.task_name <> c.task_name 
                 AND a.region = c.region = 'USA' 
             ORDER BY a.task_date, 
                      a.employee_name, 
                      a.task_name) AS temp 
GROUP BY task_date, 
         employee_name

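2 answers:

Answer 0 (score: 2)

The join runs over the whole of table "a" regardless of date, so you need to add a WHERE clause (see **) instead of putting the condition in the join. I'm not sure, but the CASE with IS NULL doesn't work every time, so I prefer to use COALESCE: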

SELECT   task_date, 
         employee_name, 
         Group_concat(task_name) 
FROM     ( 
                         SELECT DISTINCT a.task_date, 
                                         a.employee_name, 
                                         CASE 
                                                         WHEN b.employee_name IS NOT NULL
                                                                 AND  COALESCE(c.employee_name, '00') THEN '00'
                                                         ELSE a.task_name
                                         END       AS task_name 
                         FROM            forgerock AS a 
                         LEFT OUTER JOIN forgerock AS b 
                         ON              a.employee_name = b.employee_name = 'A'
                         AND             b.task_date >= '2015-01-03' 
                         AND             b.task_date <= '2015-01-06' 
                         AND             a.task_date - 1 = b.task_date
                         AND             a.region = b.region = 'USA' 
                         LEFT OUTER JOIN forgerock AS c 
                         ON              a.employee_name = c.employee_name = 'A'
                         AND             c.task_date >= '2015-01-03' 
                         AND             c.task_date <= '2015-01-06' 
                         AND             a.task_date - 1 = c.task_date
                         AND             a.task_name <> c.task_name 
                         AND             a.region = c.region = 'USA' 
                         WHERE           a.task_date >= '2015-01-03'  -- ** date filter on a moved out of the joins
                         AND             a.task_date <= '2015-01-06'
                         ORDER BY        a.task_date, 
                                         a.employee_name, 
                                         a.task_name) AS temp 
GROUP BY task_date, 
         employee_name
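
To see why the date filter on "a" has to sit in WHERE, here is a cut-down sketch against the same ForgeRock table. It is not the full query: it assumes that `a.task_date - 1` is meant as "the previous day" and writes that as `- INTERVAL 1 DAY`, and the alias prev_task_name is made up for illustration.

```sql
-- Date filter on a inside ON: a LEFT JOIN keeps every row of a,
-- so rows outside the range still come back, just with NULL
-- in prev_task_name because the ON condition fails for them.
SELECT a.task_date,
       a.employee_name,
       a.task_name,
       b.task_name AS prev_task_name
FROM ForgeRock AS a
LEFT OUTER JOIN ForgeRock AS b
       ON  b.employee_name = a.employee_name
       AND b.task_date = a.task_date - INTERVAL 1 DAY
       AND a.task_date >= '2015-01-03'
       AND a.task_date <= '2015-01-06'
ORDER BY a.task_date, a.employee_name;

-- Same join, date filter on a moved to WHERE: rows of a outside
-- the range are removed from the result.
SELECT a.task_date,
       a.employee_name,
       a.task_name,
       b.task_name AS prev_task_name
FROM ForgeRock AS a
LEFT OUTER JOIN ForgeRock AS b
       ON  b.employee_name = a.employee_name
       AND b.task_date = a.task_date - INTERVAL 1 DAY
WHERE a.task_date >= '2015-01-03'
  AND a.task_date <= '2015-01-06'
ORDER BY a.task_date, a.employee_name;
```

Running both against the sample data should show the first result spanning 2015-01-01 through 2015-01-08 and the second limited to the requested range.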

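Answer 1 (score: 1)

I'm not sure what you are trying to achieve with this query, but all the joins are left outer joins, so one table (the one in FROM) is never filtered. You should always check the query plan; here it simply shows no condition applied to that table.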

The simple fix is to specify a WHERE condition on table a or to switch to an inner join.
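
A rough sketch of the inner-join variant on the sample table, reduced to a single self-join. It assumes `- INTERVAL 1 DAY` expresses the intended "previous day" comparison, and prev_task_name is an illustrative alias, not something from the original query.

```sql
-- With INNER JOIN, a row of a is returned only when the whole ON
-- condition holds, so the date range on a now filters the output.
SELECT a.task_date,
       a.employee_name,
       a.task_name,
       b.task_name AS prev_task_name
FROM ForgeRock AS a
INNER JOIN ForgeRock AS b
       ON  b.employee_name = a.employee_name
       AND b.task_date = a.task_date - INTERVAL 1 DAY
       AND a.task_date >= '2015-01-03'
       AND a.task_date <= '2015-01-06'
ORDER BY a.task_date, a.employee_name;
```

Keep in mind that an inner join also drops rows of a that have no matching previous-day row, which may or may not be what you want.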

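As for performance, you should look at the query plan again. If you are filtering a large table without an index (the query plan shows only "Using where"), that is usually a sign that a deeper investigation is needed.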
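
A minimal sketch of that check on the sample table, assuming the date range is the main filter. The index name idx_forgerock_date_emp and its column choice are hypothetical; whether such an index helps depends on the real data volume and on the other queries hitting the table.

```sql
-- Inspect the plan for the filtered scan of table a. In the EXPLAIN
-- output, type = ALL together with "Using where" in the Extra column
-- means every row is read and then filtered.
EXPLAIN
SELECT a.task_date, a.employee_name, a.task_name
FROM ForgeRock AS a
WHERE a.task_date >= '2015-01-03'
  AND a.task_date <= '2015-01-06';

-- Hypothetical index covering the date filter and the join column.
CREATE INDEX idx_forgerock_date_emp
    ON ForgeRock (task_date, employee_name);

-- Re-run the plan and compare the type, key and rows columns.
EXPLAIN
SELECT a.task_date, a.employee_name, a.task_name
FROM ForgeRock AS a
WHERE a.task_date >= '2015-01-03'
  AND a.task_date <= '2015-01-06';
```

On a table as small as the fiddle's, the optimizer may still choose a full scan, so the comparison is only meaningful against production-sized data.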