On 1 table

Time: 2017-06-15 22:08:19

Tags: mysql join full-outer-join

What is the best way to do this?

I'll explain below, but here is a rextester.com setup with data that can be used: http://rextester.com/PXQOV60475

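I have a table named "cases" that contains information about repair jobs. It has a "case_mgr" field for the Case Manager, and also a "case_mgr_first" field that holds the original Case Manager, since the Case Manager can change. There is another similar field, "who_last_called", which is the user who last contacted the customer. These all contain usernames, although "case_mgr_first" and "who_last_called" can be null ("case_mgr_first" is a new field, and it is possible that no one has called yet).

To proceed with a repair job, the item to be repaired must be received. When it is received, the "item_received_date" field is set; otherwise it is null. The date the record was created is stored in the "created_date" field.

So the goal is to find the received % per user in several ways. I want the received rate for each user as the current Case Manager ("case_mgr"), as "case_mgr_first", and as "who_last_called", within a given period of "created_date".

I already have a query for one of them, and it works fine.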

SELECT c.case_mgr AS case_mgr, COUNT(*) AS count_new, 
SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END) AS count_recd, 
SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd
FROM cases c WHERE (c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59') 
GROUP BY c.case_mgr ORDER BY c.case_mgr ASC

This gives me the result:

+-----------+-----------+------------+--------------+
| case_mgr  | count_new | count_recd | percent_recd |
+-----------+-----------+------------+--------------+
| bamm-bamm |        10 |          4 | 40.00        |
| barney    |       105 |         43 | 40.95        |
| betty     |       120 |         60 | 50.00        |
| fred      |       139 |         54 | 38.85        |
| wilma     |        97 |         56 | 57.73        |
+-----------+-----------+------------+--------------+
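
As a side note on the query above: because COUNT(expr) only counts rows where expr is not NULL, the received count can probably be written more compactly. The following is a sketch of an equivalent form against the same "cases" table, not a change to what the query computes. The half-open date range is an assumption that the period is meant to cover all of May 2017.

```sql
-- COUNT(col) skips NULLs, so it counts the same rows as
-- SUM(CASE WHEN col IS NOT NULL THEN 1 ELSE 0 END).
SELECT c.case_mgr AS case_mgr,
       COUNT(*) AS count_new,
       COUNT(c.item_received_date) AS count_recd,
       COUNT(c.item_received_date)*100/COUNT(*) AS percent_recd
FROM cases c
-- Half-open range (assumption: the whole of May 2017 is wanted);
-- it also covers any fractional seconds after 23:59:59 on May 31.
WHERE c.created_date >= '2017-05-01'
  AND c.created_date <  '2017-06-01'
GROUP BY c.case_mgr
ORDER BY c.case_mgr ASC;
```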

When I go by "case_mgr_first", I do this.

SELECT c.case_mgr_first AS case_mgr_first, COUNT(*) AS count_new, 
SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END) AS count_recd, 
SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd 
FROM cases c WHERE (c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59') 
GROUP BY c.case_mgr_first ORDER BY c.case_mgr_first ASC

This gives me the result:

+----------------+-----------+------------+--------------+
| case_mgr_first | count_new | count_recd | percent_recd |
+----------------+-----------+------------+--------------+
| NULL           |       137 |         62 | 45.26        |
| barney         |        84 |         44 | 52.38        |
| betty          |        72 |         37 | 51.39        |
| fred           |       116 |         47 | 40.52        |
| wilma          |        61 |         19 | 31.15        |
+----------------+-----------+------------+--------------+

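(Note that bamm-bamm appears in the first result but not in the second, and that the second result has a NULL entry.)

I would like a combined result that looks like this (I removed count_new and count_recd for readability):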

+-----------+-----------------------+-----------------------------+
|  user     | percent_recd_case_mgr | percent_recd_case_mgr_first |
+-----------+-----------------------+-----------------------------+
| NULL      | NULL                  | 45.26                       |
| bamm-bamm | 40.00                 | NULL                        |
| barney    | 40.95                 | 52.38                       |
| betty     | 50.00                 | 51.39                       |
| fred      | 38.85                 | 40.52                       |
| wilma     | 57.73                 | 31.15                       |
+-----------+-----------------------+-----------------------------+

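I have gotten very close by using subqueries and joining them, and the users are combined correctly. The problem is that with a LEFT JOIN I miss the results from the second query that do not appear in the first (here, the row where the user is NULL), and with a RIGHT JOIN I miss the results from the first that are not in the second. Also, the query duration seems to be simply the sum of the subqueries, which may not be improvable; I'm not sure.

Here is the query I tried: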

SELECT * FROM 
(
SELECT c.case_mgr AS case_mgr, SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd_case_mgr
FROM cases c WHERE (c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59') 
GROUP BY c.case_mgr
) a
LEFT JOIN
(
SELECT c.case_mgr_first AS case_mgr_first, SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd_case_mgr_first
FROM cases c WHERE (c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59') 
GROUP BY c.case_mgr_first 
) b
ON a.case_mgr = b.case_mgr_first 
ORDER BY a.case_mgr ASC

Here is the result:

+-----------+-----------------------+----------------+-----------------------------+
| case_mgr  | percent_recd_case_mgr | case_mgr_first | percent_recd_case_mgr_first |
+-----------+-----------------------+----------------+-----------------------------+
| bamm-bamm | 40.00                 | NULL           | NULL                        |
| barney    | 40.95                 | barney         | 52.38                       |
| betty     | 50.00                 | betty          | 51.39                       |
| fred      | 38.85                 | fred           | 40.52                       |
| wilma     | 57.73                 | wilma          | 31.15                       |
+-----------+-----------------------+----------------+-----------------------------+

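I could do this with 2 queries and combine them in code, but having it in one query would be nice, especially if performance could somehow be improved.

After more reading, I understand that this is like a FULL OUTER JOIN in other SQL dialects, which does not exist in MySQL. In MySQL it is emulated with a UNION of a LEFT JOIN and a RIGHT JOIN. OK, I have now tried that, and it does work, but wow, it takes 0.92 seconds (and adding another field, such as the "who_last_called" I mentioned at the start, would make it really bad). Each of the 2 original queries takes about 0.22 seconds, and my first JOIN attempt takes 0.5 seconds. There is an index on the "case_mgr" field, but not on "case_mgr_first".

Any help or suggestions are appreciated! Is there a better way, or should I stick with individual queries and combine them in code?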
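
For reference, the LEFT JOIN / RIGHT JOIN emulation described above can be sketched roughly as follows. This is a reconstruction against the "cases" table from this question, not the exact query that was timed; the aliases a, b, and `user` are illustrative.

```sql
-- First half: every case_mgr, with its case_mgr_first figure where one exists.
(
  SELECT a.case_mgr     AS `user`,
         a.percent_recd AS percent_recd_case_mgr,
         b.percent_recd AS percent_recd_case_mgr_first
  FROM (
    SELECT c.case_mgr,
           SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd
    FROM cases c
    WHERE c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59'
    GROUP BY c.case_mgr
  ) a
  LEFT JOIN (
    SELECT c.case_mgr_first,
           SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd
    FROM cases c
    WHERE c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59'
    GROUP BY c.case_mgr_first
  ) b ON a.case_mgr = b.case_mgr_first
)
-- UNION (without ALL) removes the matched rows that both halves produce.
UNION
-- Second half: every case_mgr_first, including the NULL group,
-- which never matches in the ON clause because NULL = NULL is not true.
(
  SELECT b.case_mgr_first,
         a.percent_recd,
         b.percent_recd
  FROM (
    SELECT c.case_mgr,
           SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd
    FROM cases c
    WHERE c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59'
    GROUP BY c.case_mgr
  ) a
  RIGHT JOIN (
    SELECT c.case_mgr_first,
           SUM(CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END)*100/COUNT(*) AS percent_recd
    FROM cases c
    WHERE c.created_date >= '2017-05-01 00:00:00' AND c.created_date <= '2017-05-31 23:59:59'
    GROUP BY c.case_mgr_first
  ) b ON a.case_mgr = b.case_mgr_first
)
ORDER BY `user` ASC;
```

Each grouped subquery is written out twice here, so the server may end up running four aggregations rather than two. That might account for the time being close to four times that of a single query, though this is a guess and not something measured.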

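1 Answer:

Answer 0: (score: 1)

I think you could build a solution where you UNION two sub-selects and then group over that combined sub-select again, but it won't be pretty: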

SELECT tmp.mgr,
  SUM(tmp.percent_recd_case_mgr) AS percent_recd_case_mgr,
  SUM(tmp.percent_recd_case_mgr_first) AS percent_recd_case_mgr_first
FROM ((
  -- The first part contains the case_mgr data
  SELECT
    c.case_mgr AS mgr,
    SUM(
      CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END
    )*100/COUNT(*) AS percent_recd_case_mgr,
    0 AS percent_recd_case_mgr_first -- 0 as third column
  FROM cases c
  WHERE
    c.created_date >= '2017-05-01 00:00:00'
    AND c.created_date <= '2017-05-31 23:59:59'
  GROUP BY c.case_mgr ORDER BY c.case_mgr ASC
) UNION (
  -- And the second part contains the case_mgr_first data
  SELECT
    c.case_mgr_first AS mgr,
    0 AS percent_recd_case_mgr, -- 0 as second column
    SUM(
      CASE WHEN c.item_received_date IS NOT NULL THEN 1 ELSE 0 END
    )*100/COUNT(*) AS percent_recd_case_mgr_first
  FROM cases c
  WHERE
    c.created_date >= '2017-05-01 00:00:00'
    AND c.created_date <= '2017-05-31 23:59:59'
  GROUP BY c.case_mgr_first ORDER BY c.case_mgr_first ASC
)) AS tmp -- together both parts form a temp table and we sum again
          -- over all records
GROUP BY tmp.mgr
ORDER BY tmp.mgr ASC;
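
A possible variation on the query above, offered as a sketch and not as a tested improvement. It uses NULL instead of 0 as the placeholder, so a user who is missing from one role shows NULL as in the desired output. It uses UNION ALL to skip duplicate elimination, and it adds a third branch for "who_last_called". The names pct_case_mgr, pct_case_mgr_first, pct_who_last_called, and percent_recd_who_last_called are illustrative and do not come from the question.

```sql
SELECT tmp.mgr AS `user`,
       -- Each user has at most one non-NULL value per column, so MAX just picks it;
       -- if every value is NULL the result stays NULL.
       MAX(tmp.pct_case_mgr)        AS percent_recd_case_mgr,
       MAX(tmp.pct_case_mgr_first)  AS percent_recd_case_mgr_first,
       MAX(tmp.pct_who_last_called) AS percent_recd_who_last_called
FROM (
  -- One branch per role; COUNT(col) counts only non-NULL item_received_date values.
  SELECT c.case_mgr AS mgr,
         COUNT(c.item_received_date)*100/COUNT(*) AS pct_case_mgr,
         NULL AS pct_case_mgr_first,
         NULL AS pct_who_last_called
  FROM cases c
  WHERE c.created_date >= '2017-05-01 00:00:00'
    AND c.created_date <= '2017-05-31 23:59:59'
  GROUP BY c.case_mgr
  UNION ALL
  SELECT c.case_mgr_first,
         NULL,
         COUNT(c.item_received_date)*100/COUNT(*),
         NULL
  FROM cases c
  WHERE c.created_date >= '2017-05-01 00:00:00'
    AND c.created_date <= '2017-05-31 23:59:59'
  GROUP BY c.case_mgr_first
  UNION ALL
  SELECT c.who_last_called,
         NULL,
         NULL,
         COUNT(c.item_received_date)*100/COUNT(*)
  FROM cases c
  WHERE c.created_date >= '2017-05-01 00:00:00'
    AND c.created_date <= '2017-05-31 23:59:59'
  GROUP BY c.who_last_called
) AS tmp
-- GROUP BY puts all NULL users into one group, giving a single NULL row.
GROUP BY tmp.mgr
ORDER BY tmp.mgr ASC;
```

This still aggregates the table once per role, so the running time would presumably be close to the sum of the individual queries. What it avoids is repeating each subquery twice, as the LEFT JOIN / RIGHT JOIN emulation does.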