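I have rewritten a query to join tables instead of running subqueries, because I need to look up 10 counts and 10 subqueries were causing performance problems.
(Table and column names have been changed for simplicity.)
The query used to look like this: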
SELECT t1.col1, t1.col2, t1.col3,
(SELECT COUNT(j1.j_id) FROM jointable1 as j1 WHERE t1.t_employee_id = j1.j_employee_id
AND t1.t_week_ending = j1.j_week_ending AND j1.j_reason <> 'DNC') as col4,
(SELECT COUNT(j2.j_id) FROM jointable1 as j2 WHERE t1.t_employee_id = j2.j_employee_id
AND t1.t_week_ending = j2.j_week_ending) as col5
FROM table1 as t1
GROUP BY t1.col1, t1.col2, t1.col3;
I have rewritten it like this:
SELECT t1.col1, t1.col2, t1.col3, COUNT(j1.j_id) as col4, COUNT(j2.o_id) as col5
FROM table1 as t1
LEFT JOIN jointable1 as j1 ON (t1.t_employee_id = j1.j_employee_id
AND t1.t_week_ending = j1.j_week_ending)
AND j1.j_reason <> 'DNC'
GROUP BY t1.col1, t1.col2, t1.col3;
The problem is that the top example returns the right values for col4 and col5. Say they return 7 and 8.
+------+------+------+------+
| col1 | col2 | col3 | col4 |
+------+------+------+------+
| 1    | 0    | 0    | 34   |
| 0    | 3    | 3    | 9    |
| 7    | 1    | 0    | 2    |
| 3    | 2    | 2    | 9    |
| 4    | 1    | 0    | 4    |
| 1    | 11   | 1    | 4    |
| 5    | 2    | 5    | 21   |
| 2    | 3    | 0    | 3    |
| 2    | 3    | 0    | 2    |
+------+------+------+------+
In the bottom query, however, the values come back squared, that is, multiplied by themselves. So 7 becomes 49 and 8 becomes 64.
+------+------+------+------+
| col1 | col2 | col3 | col4 |
+------+------+------+------+
| 1    | 0    | 0    | 1156 |
| 0    | 3    | 3    | 81   |
| 7    | 1    | 0    | 16   |
| 3    | 2    | 2    | 81   |
| 4    | 1    | 0    | 16   |
| 1    | 11   | 1    | 16   |
| 5    | 2    | 5    | 441  |
| 2    | 3    | 0    | 9    |
| 2    | 3    | 0    | 4    |
+------+------+------+------+
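I can't tell whether I'm missing something in the LEFT JOIN or in the GROUP BY. Any help correcting this would be great, and so would any suggestion for a more efficient way to do the rewrite.
Answer 0 (score: 2)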
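If the JOINs have multiple matching records, the number of rows can multiply, which can give you incorrect results when you use aggregate functions such as COUNT. You need to use COUNT together with DISTINCT, as shown below.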
SELECT t1.col1,
       t1.col2,
       t1.col3,
       COUNT(DISTINCT j1.j_id) AS col4,
       COUNT(DISTINCT j1.o_id) AS col5
FROM table1 AS t1
LEFT JOIN jointable1 AS j1
       ON t1.t_employee_id = j1.j_employee_id
      AND t1.t_week_ending = j1.j_week_ending
      AND j1.j_reason <> 'DNC'
GROUP BY t1.col1,
         t1.col2,
         t1.col3;
Note: your query uses the alias j2, which is not defined anywhere; you will need to correct that accordingly.
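To make the row multiplication concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the asker's actual database. It assumes the real query joined jointable1 a second time as j2 (the posted query omits that join), and the sample rows are made up for illustration. With 2 non-DNC matches and 3 matches overall, the join produces 2 x 3 = 6 rows, so both plain counts come back as 6, while COUNT(DISTINCT ...) recovers 2 and 3.

```python
import sqlite3

# Hypothetical sample data: one employee/week with three child rows, one of them 'DNC'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (t_employee_id INTEGER, t_week_ending TEXT, col1 INTEGER);
    CREATE TABLE jointable1 (j_id INTEGER PRIMARY KEY, j_employee_id INTEGER,
                             j_week_ending TEXT, j_reason TEXT);
    INSERT INTO table1 VALUES (1, '2020-01-05', 10);
    INSERT INTO jointable1 VALUES
        (1, 1, '2020-01-05', 'OK'),
        (2, 1, '2020-01-05', 'OK'),
        (3, 1, '2020-01-05', 'DNC');
""")

# Joining the same child table twice pairs every j1 match with every j2 match.
sql = """
    SELECT t1.col1,
           COUNT(j1.j_id)          AS plain_col4,
           COUNT(j2.j_id)          AS plain_col5,
           COUNT(DISTINCT j1.j_id) AS distinct_col4,
           COUNT(DISTINCT j2.j_id) AS distinct_col5
    FROM table1 AS t1
    LEFT JOIN jointable1 AS j1
           ON t1.t_employee_id = j1.j_employee_id
          AND t1.t_week_ending = j1.j_week_ending
          AND j1.j_reason <> 'DNC'
    LEFT JOIN jointable1 AS j2
           ON t1.t_employee_id = j2.j_employee_id
          AND t1.t_week_ending = j2.j_week_ending
    GROUP BY t1.col1
"""
row = conn.execute(sql).fetchone()
print(row)  # (10, 6, 6, 2, 3)
```

DISTINCT only works here because j_id is unique per child row; counting a non-unique column this way would undercount.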
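Answer 1 (score: 1)
Try writing the query with OUTER APPLY. That will be more efficient. Also, your second query will not give you the correct count for col5. You need to count the rows whose j_reason is not DNC for col4, and all rows for col5.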
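One way to get both counts from a single join is conditional aggregation. This is a different technique from OUTER APPLY, offered only as a sketch; it runs on SQLite through Python's sqlite3 module, and the sample rows are invented. The CASE expression yields NULL for DNC rows, and COUNT skips NULLs, so col4 counts only non-DNC rows while col5 counts every matched row.

```python
import sqlite3

# Hypothetical sample data: employee 1 has three child rows (one 'DNC'), employee 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (t_employee_id INTEGER, t_week_ending TEXT, col1 INTEGER);
    CREATE TABLE jointable1 (j_id INTEGER PRIMARY KEY, j_employee_id INTEGER,
                             j_week_ending TEXT, j_reason TEXT);
    INSERT INTO table1 VALUES (1, '2020-01-05', 10), (2, '2020-01-05', 20);
    INSERT INTO jointable1 VALUES
        (1, 1, '2020-01-05', 'OK'),
        (2, 1, '2020-01-05', 'OK'),
        (3, 1, '2020-01-05', 'DNC');
""")

# A single LEFT JOIN; the DNC filter moves out of the ON clause and into the aggregate.
sql = """
    SELECT t1.col1,
           COUNT(CASE WHEN j.j_reason <> 'DNC' THEN j.j_id END) AS col4,
           COUNT(j.j_id)                                        AS col5
    FROM table1 AS t1
    LEFT JOIN jointable1 AS j
           ON t1.t_employee_id = j.j_employee_id
          AND t1.t_week_ending = j.j_week_ending
    GROUP BY t1.col1
    ORDER BY t1.col1
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [(10, 2, 3), (20, 0, 0)]
```

Because the child table is joined only once, there is no second set of matches to multiply against, and further counts can be added as extra CASE expressions.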
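Answer 2 (score: 0)
It is best to do the counting in subqueries that work out the counts for every combination, and then join to those results, because you know each subquery will contribute only one row to the join.
The problem arises when you join to multiple tables in a 1-to-many fashion. If you have two 1-to-2 associations and join both at once, you get 4 rows, not 2.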
SELECT t1.col1, t1.col2, t1.col3, c4.col4, c5.col5
FROM table1 AS t1
OUTER APPLY
(
    SELECT COUNT(j1.j_id) AS col4
    FROM jointable1 AS j1
    WHERE t1.t_employee_id = j1.j_employee_id
      AND t1.t_week_ending = j1.j_week_ending
      AND j1.j_reason <> 'DNC'
) AS c4
OUTER APPLY
(
    SELECT COUNT(j2.j_id) AS col5
    FROM jointable1 AS j2
    WHERE t1.t_employee_id = j2.j_employee_id
      AND t1.t_week_ending = j2.j_week_ending
) AS c5;
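The "count in subqueries, then join" idea can also be written with pre-aggregated derived tables, which may help on engines without OUTER APPLY. The following is a rough sketch on SQLite via Python's sqlite3 module, with made-up sample rows. Each derived table is grouped by employee and week, so it returns at most one row per table1 row and the join cannot multiply rows; COALESCE turns a missing match into 0.

```python
import sqlite3

# Hypothetical sample data: employee 1 has three child rows (one 'DNC'), employee 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (t_employee_id INTEGER, t_week_ending TEXT, col1 INTEGER);
    CREATE TABLE jointable1 (j_id INTEGER PRIMARY KEY, j_employee_id INTEGER,
                             j_week_ending TEXT, j_reason TEXT);
    INSERT INTO table1 VALUES (1, '2020-01-05', 10), (2, '2020-01-05', 20);
    INSERT INTO jointable1 VALUES
        (1, 1, '2020-01-05', 'OK'),
        (2, 1, '2020-01-05', 'OK'),
        (3, 1, '2020-01-05', 'DNC');
""")

# Each derived table yields one row per (employee, week), so the joins stay 1-to-1.
sql = """
    SELECT t1.col1,
           COALESCE(c4.col4, 0) AS col4,
           COALESCE(c5.col5, 0) AS col5
    FROM table1 AS t1
    LEFT JOIN (SELECT j_employee_id, j_week_ending, COUNT(j_id) AS col4
               FROM jointable1
               WHERE j_reason <> 'DNC'
               GROUP BY j_employee_id, j_week_ending) AS c4
           ON t1.t_employee_id = c4.j_employee_id
          AND t1.t_week_ending = c4.j_week_ending
    LEFT JOIN (SELECT j_employee_id, j_week_ending, COUNT(j_id) AS col5
               FROM jointable1
               GROUP BY j_employee_id, j_week_ending) AS c5
           ON t1.t_employee_id = c5.j_employee_id
          AND t1.t_week_ending = c5.j_week_ending
    ORDER BY t1.col1
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [(10, 2, 3), (20, 0, 0)]
```

Whether this beats OUTER APPLY on a given server depends on the data and indexes, so it is worth comparing the execution plans.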