LEFT JOIN count returns "squared" numbers instead of the actual counts?

Time: 2019-03-12 09:30:38

Tags: sql sql-server database

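I have rewritten a query to join tables instead of running subqueries, because I need to look up 10 counts and the 10 subqueries were causing performance problems.

Table and column names have been changed for simplicity.

The query used to look like this: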

SELECT t1.col1, t1.col2, t1.col3, 
(SELECT COUNT(j1.j_id) FROM jointable1 as j1 WHERE t1.t_employee_id = j1.j_employee_id
    AND t1.t_week_ending = j1.j_week_ending AND j1.j_reason <> 'DNC') as col4,
(SELECT COUNT(j2.j_id) FROM jointable1 as j2 WHERE t1.t_employee_id = j2.j_employee_id
    AND t1.t_week_ending = j2.j_week_ending) as col5
FROM table1 as t1
GROUP BY t1.col1, t1.col2, t1.col3;

I have rewritten it like this:

SELECT t1.col1, t1.col2, t1.col3, COUNT(j1.j_id) as col4, COUNT(j2.o_id) as col5
FROM table1 as t1
LEFT JOIN jointable1 as j1 ON (t1.t_employee_id = j1.j_employee_id
    AND t1.t_week_ending = j1.j_week_ending)
    AND j1.j_reason <> 'DNC'
GROUP BY t1.col1, t1.col2, t1.col3;

The problem is that in the top example, the values returned for col4 and col5 are fine. Say they return 7 and 8.

+------+------+------+------+
| col1 | col2 | col3 | col4 |
+------+------+------+------+
|    1 |    0 |    0 |   34 |
|    0 |    3 |    3 |    9 |
|    7 |    1 |    0 |    2 |
|    3 |    2 |    2 |    9 |
|    4 |    1 |    0 |    4 |
|    1 |   11 |    1 |    4 |
|    5 |    2 |    5 |   21 |
|    2 |    3 |    0 |    3 |
|    2 |    3 |    0 |    2 |
+------+------+------+------+

But in the bottom query, the values come back squared, that is, multiplied by themselves. So 7 becomes 49 and 8 becomes 64.

+------+------+------+------+
| col1 | col2 | col3 | col4 |
+------+------+------+------+
|    1 |    0 |    0 | 1156 |
|    0 |    3 |    3 |   81 |
|    7 |    1 |    0 |   16 |
|    3 |    2 |    2 |   81 |
|    4 |    1 |    0 |   16 |
|    1 |   11 |    1 |   16 |
|    5 |    2 |    5 |  441 |
|    2 |    3 |    0 |    9 |
|    2 |    3 |    0 |    4 |
+------+------+------+------+

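I can't tell whether I'm missing something in the LEFT JOIN or in the GROUP BY. Any help correcting this would be great, as would suggestions for a more efficient way to do the rewrite.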

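3 answers:

Answer 0: (score: 2)

If the joins produce multiple matching records, the number of rows multiplies, which can give you incorrect results from aggregate functions such as COUNT. You need to use COUNT together with DISTINCT, as shown below.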

SELECT    t1.col1,
          t1.col2,
          t1.col3,
          COUNT(DISTINCT j1.j_id) AS col4,
          COUNT(DISTINCT j1.o_id) AS col5
FROM      table1                  AS t1
LEFT JOIN jointable1              AS j1
ON        t1.t_employee_id = j1.j_employee_id
AND       t1.t_week_ending = j1.j_week_ending
AND       j1.j_reason <> 'DNC'
GROUP BY  t1.col1,
          t1.col2,
          t1.col3;

Note: your query uses the alias j2, which is not defined anywhere. You need to correct that accordingly.
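To make the row multiplication concrete, here is a minimal T-SQL sketch. The table variables `@table1` and `@jointable1` and their sample rows are invented for illustration; they only mimic the columns named in the question. It joins the same child table twice, which is presumably what the asker's full query does, and compares a plain COUNT with COUNT(DISTINCT).

```sql
-- Hypothetical miniature data: one employee-week with three child rows.
DECLARE @table1 TABLE (t_employee_id INT, t_week_ending DATE);
DECLARE @jointable1 TABLE (j_id INT, j_employee_id INT,
                           j_week_ending DATE, j_reason VARCHAR(10));

INSERT INTO @table1 VALUES (1, '2019-03-08');
INSERT INTO @jointable1 VALUES
    (10, 1, '2019-03-08', 'SICK'),
    (11, 1, '2019-03-08', 'DNC'),
    (12, 1, '2019-03-08', 'LATE');

-- Each of the 3 rows from j1 pairs with each of the 3 rows from j2,
-- so the join yields 3 * 3 = 9 rows before aggregation.
SELECT  COUNT(j1.j_id)          AS plain_count,     -- 9
        COUNT(DISTINCT j1.j_id) AS distinct_count   -- 3
FROM    @table1 AS t1
LEFT JOIN @jointable1 AS j1
       ON  t1.t_employee_id = j1.j_employee_id
       AND t1.t_week_ending = j1.j_week_ending
LEFT JOIN @jointable1 AS j2
       ON  t1.t_employee_id = j2.j_employee_id
       AND t1.t_week_ending = j2.j_week_ending;
```

COUNT(DISTINCT j_id) only undoes the multiplication if j_id is unique within jointable1, which the question does not state explicitly.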

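Answer 1: (score: 1)

Try writing the query with OUTER APPLY; it will be more efficient. Also, you will not get the correct count for col5 from your second query. You need to count the rows where j_reason is not 'DNC' for col4, and all rows for col5.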
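The code that accompanied this answer is not available here. As a rough sketch of the idea rather than the answerer's actual query, a single OUTER APPLY can produce both counts in one pass over jointable1 by counting conditionally. It runs against the question's table1 and jointable1; the aliases `j` and `c` are illustrative.

```sql
SELECT  t1.col1, t1.col2, t1.col3, c.col4, c.col5
FROM    table1 AS t1
OUTER APPLY
(
    -- COUNT ignores NULLs, so the CASE expression counts only
    -- the rows whose reason is something other than 'DNC'.
    SELECT  COUNT(CASE WHEN j.j_reason <> 'DNC' THEN j.j_id END) AS col4,
            COUNT(j.j_id)                                        AS col5
    FROM    jointable1 AS j
    WHERE   j.j_employee_id = t1.t_employee_id
    AND     j.j_week_ending = t1.t_week_ending
) AS c;
```

Because the applied subquery aggregates without a GROUP BY, it returns exactly one row per row of table1, so no rows are multiplied.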

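Answer 2: (score: 0)

It is better to do the counting in subqueries, one per count, and then join to those results, because you know you will only ever join one row from each subquery.

The problem arises when you join to multiple tables in a one-to-many fashion. If you have two 1-to-2 relationships and join both at the same time, you get 4 rows, not 2.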

SELECT  t1.col1, t1.col2, t1.col3, c4.col4, c5.col5
FROM    table1 as t1
OUTER APPLY
(
    SELECT  COUNT(j1.j_id) col4
    FROM    jointable1 as j1 
    WHERE   t1.t_employee_id = j1.j_employee_id
    AND     t1.t_week_ending = j1.j_week_ending 
    AND     j1.j_reason <> 'DNC'
)c4
OUTER APPLY
(
    SELECT  COUNT(j2.j_id) col5
    FROM    jointable1 as j2 
    WHERE   t1.t_employee_id = j2.j_employee_id
    AND     t1.t_week_ending = j2.j_week_ending
)c5
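The same "one row per subquery" idea can also be expressed as a join to a pre-aggregated derived table. This is a sketch under the assumption that all counts come from jointable1 keyed by employee and week ending, as in the question; the alias `c` and the COALESCE defaults are illustrative choices, not part of the original answer.

```sql
SELECT  t1.col1, t1.col2, t1.col3,
        COALESCE(c.col4, 0) AS col4,   -- no matching rows means a count of 0
        COALESCE(c.col5, 0) AS col5
FROM    table1 AS t1
LEFT JOIN
(
    -- Aggregate first: at most one row per (employee, week ending),
    -- so the outer join cannot multiply rows of table1.
    SELECT  j.j_employee_id,
            j.j_week_ending,
            COUNT(CASE WHEN j.j_reason <> 'DNC' THEN j.j_id END) AS col4,
            COUNT(j.j_id)                                        AS col5
    FROM    jointable1 AS j
    GROUP BY j.j_employee_id, j.j_week_ending
) AS c
       ON  c.j_employee_id = t1.t_employee_id
       AND c.j_week_ending = t1.t_week_ending;
```

Whether this or the OUTER APPLY form performs better depends on the data and indexes, so it is worth comparing the execution plans for both.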