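What is the best way to do this?
I'll explain below, but here is a rextester.com setup with data that you can use - http://rextester.com/PXQOV60475
I have a table named "cases" that holds information about repair jobs. It has a "case_mgr" field for the case manager, and a "case_mgr_first" field that holds the original case manager, since the case manager can change. There is another similar field, "who_last_called", which is the user who last contacted the customer. These all contain usernames, although "case_mgr_first" and "who_last_called" can be NULL ("case_mgr_first" is a new field, and it's possible that nobody has called yet).
To proceed with a repair job, the item to be repaired has to be received. When it is received, the "item_received_date" field is set; otherwise it is NULL. The date the record was created is stored in the "created_date" field.
So the goal is to find the receive % per user in several ways. I want this receive rate for each user as the current case manager ("case_mgr"), as "case_mgr_first", and as "who_last_called", for cases whose "created_date" falls within a given time period.
I already have a query for one of these, and it works fine.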
SELECT c.case_mgr AS case_mgr, COUNT(*) AS count_new,
SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END) AS count_recd,
SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd
FROM cases c WHERE (c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59')
GROUP BY c.case_mgr ORDER BY c.case_mgr ASC
That gives me this result -
+-----------+-----------+------------+--------------+
| case_mgr | count_new | count_recd | percent_recd |
+-----------+-----------+------------+--------------+
| bamm-bamm | 10 | 4 | 40.00 |
| barney | 105 | 43 | 40.95 |
| betty | 120 | 60 | 50.00 |
| fred | 139 | 54 | 38.85 |
| wilma | 97 | 56 | 57.73 |
+-----------+-----------+------------+--------------+
When I go by "case_mgr_first" instead, I do this.
SELECT c.case_mgr_first AS case_mgr_first, COUNT(*) AS count_new,
SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END) AS count_recd,
SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd
FROM cases c WHERE (c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59')
GROUP BY c.case_mgr_first ORDER BY c.case_mgr_first ASC
That gives me this result -
+----------------+-----------+------------+--------------+
| case_mgr_first | count_new | count_recd | percent_recd |
+----------------+-----------+------------+--------------+
| NULL | 137 | 62 | 45.26 |
| barney | 84 | 44 | 52.38 |
| betty | 72 | 37 | 51.39 |
| fred | 116 | 47 | 40.52 |
| wilma | 61 | 19 | 31.15 |
+----------------+-----------+------------+--------------+
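(Note that bamm-bamm appears in the first result but not in the second, and that the second result has a NULL entry.)
I want a combined result that looks like this (I've removed count_new and count_recd for readability) -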
+-----------+-----------------------+-----------------------------+
| user | percent_recd_case_mgr | percent_recd_case_mgr_first |
+-----------+-----------------------+-----------------------------+
| NULL | NULL | 45.26 |
| bamm-bamm | 40.00 | NULL |
| barney | 40.95 | 52.38 |
| betty | 50.00 | 51.39 |
| fred | 38.85 | 40.52 |
| wilma | 57.73 | 31.15 |
+-----------+-----------------------+-----------------------------+
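I've gotten very close by using subqueries and joining them, and the users are combined correctly. The problem is that with a LEFT JOIN I miss the results from the second query that don't appear in the first (the row where the user is NULL), and with a RIGHT JOIN I miss the results from the first that aren't in the second. Also, the query's duration seems to be just the sum of the subqueries, which maybe can't be improved; I'm not sure.
Here is the query I tried -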
SELECT * FROM
(
SELECT c.case_mgr AS case_mgr, SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd_case_mgr
FROM cases c WHERE (c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59')
GROUP BY c.case_mgr
) a
LEFT JOIN
(
SELECT c.case_mgr_first AS case_mgr_first, SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd_case_mgr_first
FROM cases c WHERE (c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59')
GROUP BY c.case_mgr_first
) b
ON a.case_mgr = b.case_mgr_first
ORDER BY a.case_mgr ASC
Here is the result -
+-----------+-----------------------+----------------+-----------------------------+
| case_mgr | percent_recd_case_mgr | case_mgr_first | percent_recd_case_mgr_first |
+-----------+-----------------------+----------------+-----------------------------+
| bamm-bamm | 40.00 | NULL | NULL |
| barney | 40.95 | barney | 52.38 |
| betty | 50.00 | betty | 51.39 |
| fred | 38.85 | fred | 40.52 |
| wilma | 57.73 | wilma | 31.15 |
+-----------+-----------------------+----------------+-----------------------------+
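I could do this with 2 queries and combine them in code, but it would be nice to do it in one query, especially if the performance can be improved somehow.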
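For reference, combining the two result sets in application code amounts to a full outer join on the user name. Here is a minimal sketch in Python. The language is an assumption, since the post doesn't say what the application is written in, and `combine` is a hypothetical helper. It is fed with the percentages from the two result tables above:

```python
def combine(by_case_mgr, by_case_mgr_first):
    """Full outer join of two {user: percent} mappings on the user name."""
    users = set(by_case_mgr) | set(by_case_mgr_first)
    # Put the None key (the NULL group) first, then sort the remaining names.
    ordered = sorted(users, key=lambda u: (u is not None, u or ""))
    # dict.get returns None when a user is missing from one side.
    return [(u, by_case_mgr.get(u), by_case_mgr_first.get(u)) for u in ordered]


# Percentages copied from the two per-field results shown in the question.
by_case_mgr = {
    "bamm-bamm": 40.00,
    "barney": 40.95,
    "betty": 50.00,
    "fred": 38.85,
    "wilma": 57.73,
}
by_case_mgr_first = {
    None: 45.26,
    "barney": 52.38,
    "betty": 51.39,
    "fred": 40.52,
    "wilma": 31.15,
}

combined = combine(by_case_mgr, by_case_mgr_first)
for user, pct_case_mgr, pct_case_mgr_first in combined:
    print(user, pct_case_mgr, pct_case_mgr_first)
```

A user missing from one side gets None in that column, and the None key sorts first, which mirrors how MySQL orders NULL in ascending order.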
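After more reading, I understand that this is like a FULL OUTER JOIN in other SQL dialects, which doesn't exist in MySQL. In MySQL it is emulated with a UNION of a LEFT JOIN and a RIGHT JOIN. OK, I've now tried that and it does work, but wow, it takes 0.92 seconds (and adding another field, such as the "who_last_called" I mentioned at the start, would make it really bad). Each of the original 2 queries takes about 0.22 seconds, and my first JOIN attempt took 0.5 seconds. The "case_mgr" field has an index, but "case_mgr_first" does not.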
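The emulation described above can also be written with UNION ALL plus an anti-join (the rows of the second subquery that have no match in the first), which skips the duplicate-removal step of a plain UNION. The sketch below is only an illustration under stated assumptions: it runs against an in-memory SQLite database through Python's `sqlite3` module so that it is self-contained, the rows are made-up sample data rather than the real table, and the WITH clauses would need MySQL 8.0 or later (on older versions the two subqueries would have to be repeated inline). Whether it is actually faster on the real table would have to be measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE cases (
           case_mgr TEXT NOT NULL,
           case_mgr_first TEXT,
           who_last_called TEXT,
           item_received_date TEXT,
           created_date TEXT NOT NULL)"""
)
# Made-up sample rows (not the asker's data); the last one is outside May 2017.
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?, ?, ?)",
    [
        ("fred", "fred", "wilma", "2017-05-10", "2017-05-02 09:00:00"),
        ("fred", None, None, None, "2017-05-03 09:00:00"),
        ("barney", "fred", "barney", None, "2017-05-04 09:00:00"),
        ("barney", "barney", None, "2017-05-12", "2017-05-05 09:00:00"),
        ("bamm-bamm", None, "wilma", "2017-05-20", "2017-05-06 09:00:00"),
        ("fred", "fred", "fred", "2017-06-05", "2017-06-01 09:00:00"),
    ],
)

# 100.0 forces non-integer division in SQLite; MySQL's "/" already does that.
PCT = "SUM(CASE WHEN item_received_date IS NOT NULL THEN 1 ELSE 0 END) * 100.0 / COUNT(*)"
PERIOD = "created_date >= '2017-05-01 00:00:00' AND created_date <= '2017-05-31 23:59:59'"

QUERY = f"""
WITH a AS (
    SELECT case_mgr AS usr, {PCT} AS pct
    FROM cases WHERE {PERIOD} GROUP BY case_mgr
),
b AS (
    SELECT case_mgr_first AS usr, {PCT} AS pct
    FROM cases WHERE {PERIOD} GROUP BY case_mgr_first
)
-- Every user from a, with the matching b value when there is one.
SELECT a.usr AS usr, a.pct AS pct_case_mgr, b.pct AS pct_case_mgr_first
FROM a LEFT JOIN b ON a.usr = b.usr
UNION ALL
-- Anti-join: users (including the NULL group) that exist only in b.
SELECT b.usr, NULL, b.pct
FROM b LEFT JOIN a ON a.usr = b.usr
WHERE a.usr IS NULL
ORDER BY usr
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The NULL group from "case_mgr_first" never matches anything in the join condition, so it arrives through the second branch. If "case_mgr" could also be NULL, the two NULL groups would come out as two separate rows, which may or may not be what is wanted.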
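Any help or suggestions are appreciated! Is there a better way, or should I stick with individual queries and combine them in code?
Answer 0 (score: 1)
I think you could do a solution where you UNION the subselects and then group over that subselect again, but it won't be pretty: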
SELECT tmp.mgr,
SUM(tmp.percent_recd_case_mgr) AS percent_recd_case_mgr,
SUM(tmp.percent_recd_case_mgr_first) AS percent_recd_case_mgr_first
FROM ((
-- the first part basically contains the case_mgr data
SELECT
c.case_mgr AS mgr,
SUM(
CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END
)*100/COUNT(*) AS percent_recd_case_mgr,
0 AS percent_recd_case_mgr_first -- 0 as third column
FROM cases c
WHERE
c.created_date >= '2017-05-01 00:00:00'
AND c.created_date <= '2017-05-31 23:59:59'
GROUP BY c.case_mgr ORDER BY c.case_mgr ASC
) UNION (
-- And the second part contains the case_mgr_first data
SELECT
c.case_mgr_first AS mgr,
0 AS percent_recd_case_mgr, -- 0 as second column
SUM(
CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END
)*100/COUNT(*) AS percent_recd_case_mgr_first
FROM cases c
WHERE
c.created_date >= '2017-05-01 00:00:00'
AND c.created_date <= '2017-05-31 23:59:59'
GROUP BY c.case_mgr_first ORDER BY c.case_mgr_first ASC
)) AS tmp -- together both parts form a temp table and we sum again
-- over all records
GROUP BY tmp.mgr
ORDER BY tmp.mgr ASC;
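One thing to note about the query above: because of the 0 placeholders, a user who is missing from one side shows 0 rather than NULL. If NULL is preferred, as in the desired output in the question, one possible variation is to use NULL placeholders, UNION ALL, and MAX instead of SUM in the outer query. A third branch for "who_last_called" then follows the same pattern. This is only a sketch under assumptions: it runs on an in-memory SQLite database through Python's `sqlite3` module to stay self-contained, the rows are made-up sample data, and the `usr` and `pct_*` names are chosen here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE cases (
           case_mgr TEXT NOT NULL,
           case_mgr_first TEXT,
           who_last_called TEXT,
           item_received_date TEXT,
           created_date TEXT NOT NULL)"""
)
# Made-up sample rows (not the asker's data); the last one is outside May 2017.
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?, ?, ?)",
    [
        ("fred", "fred", "wilma", "2017-05-10", "2017-05-02 09:00:00"),
        ("fred", None, None, None, "2017-05-03 09:00:00"),
        ("barney", "fred", "barney", None, "2017-05-04 09:00:00"),
        ("barney", "barney", None, "2017-05-12", "2017-05-05 09:00:00"),
        ("bamm-bamm", None, "wilma", "2017-05-20", "2017-05-06 09:00:00"),
        ("fred", "fred", "fred", "2017-06-05", "2017-06-01 09:00:00"),
    ],
)

# 100.0 forces non-integer division in SQLite; MySQL's "/" already does that.
PCT = "SUM(CASE WHEN item_received_date IS NOT NULL THEN 1 ELSE 0 END) * 100.0 / COUNT(*)"
PERIOD = "created_date >= '2017-05-01 00:00:00' AND created_date <= '2017-05-31 23:59:59'"

QUERY = f"""
SELECT tmp.usr,
       MAX(tmp.pct_case_mgr)        AS pct_case_mgr,
       MAX(tmp.pct_case_mgr_first)  AS pct_case_mgr_first,
       MAX(tmp.pct_who_last_called) AS pct_who_last_called
FROM (
    -- One branch per role; the other two columns are NULL placeholders.
    SELECT case_mgr AS usr, {PCT} AS pct_case_mgr,
           NULL AS pct_case_mgr_first, NULL AS pct_who_last_called
    FROM cases WHERE {PERIOD} GROUP BY case_mgr
    UNION ALL
    SELECT case_mgr_first, NULL, {PCT}, NULL
    FROM cases WHERE {PERIOD} GROUP BY case_mgr_first
    UNION ALL
    SELECT who_last_called, NULL, NULL, {PCT}
    FROM cases WHERE {PERIOD} GROUP BY who_last_called
) AS tmp
-- MAX ignores NULLs, so each column keeps its single real value (or NULL).
GROUP BY tmp.usr
ORDER BY tmp.usr
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each branch produces at most one row per user, so MAX simply picks that row's value and stays NULL when the user never appears in that role. A genuine 0% (barney as "who_last_called" in the sample rows) remains distinguishable from a missing value.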