SQL Server COUNT across multiple tables returns wrong results

Time: 2018-12-19 15:25:39

Tags: sql sql-server

Here are the tables:

FROM Account AS acc

AccountName
-------------
Account #1
Account #3
Account #2

FROM Division AS div

AccountName DivisionName
----------- ---------------------------
Account #1  Division TWO for Account #1
Account #1  Division ONE for Account #1

FROM AccountSupplier AS acc_sup (many-to-many join)

AccountName SupplierName
----------- ------------
Account #1  Supplier #6
Account #1  Supplier #1
Account #1  Supplier #3
Account #2  Supplier #1
Account #2  Supplier #2

Here is the query:

SELECT 
    acc.AccountName,
    COUNT(div.AccountId) AS CountDivisions,
    COUNT(acc_sup.AccountId) AS CountSuppliers 
FROM 
    Account AS acc
LEFT JOIN 
    Division AS div ON (div.AccountId = acc.Id)
LEFT JOIN 
    AccountSupplier AS acc_sup ON (acc_sup.AccountId = acc.Id)
GROUP BY 
    acc.AccountName

Here is the result:

AccountName CountDivisions  CountSuppliers
----------- --------------- --------------
Account #1  6               6
Account #2  0               2
Account #3  0               0

It should be:

AccountName CountDivisions  CountSuppliers
----------- --------------- --------------
Account #1  2               3
Account #2  0               2
Account #3  0               0

Note that adding the DISTINCT keyword also produces odd results:

SELECT 
    acc.AccountName,
    COUNT(DISTINCT div.AccountId) AS CountDivisions,
    COUNT(DISTINCT acc_sup.AccountId) AS CountSuppliers 
FROM 
    Account AS acc
LEFT JOIN 
    Division AS div ON (div.AccountId = acc.Id)
LEFT JOIN 
    AccountSupplier AS acc_sup ON (acc_sup.AccountId = acc.Id)
GROUP BY 
    acc.AccountName

This produces:

AccountName CountDivisions  CountSuppliers
----------- --------------- --------------
Account #1  1               1
Account #2  0               1
Account #3  0               0

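Huh? I am probably overlooking something simple, but this result is clearly incorrect. Can someone suggest the correct way to write this query so that it returns the right results?

Thanks!

4 answers:

Answer 0: (score: 1)

Just add DISTINCT where you want to count unique values: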

SELECT acc.AccountName,
COUNT(DISTINCT div.AccountId) AS CountDivisions,
COUNT(DISTINCT acc_sup.AccountId) AS CountSuppliers 
FROM Account AS acc
LEFT JOIN Division AS div ON (div.AccountId = acc.Id)
LEFT JOIN AccountSupplier AS acc_sup ON (acc_sup.AccountId = acc.Id)
GROUP BY acc.AccountName, div.AccountId, acc_sup.AccountId

Answer 1: (score: 1)

You can use the DISTINCT keyword inside the COUNT expression:

SELECT acc.AccountName,
COUNT(distinct div.AccountId) AS CountDivisions,
COUNT(distinct acc_sup.AccountId) AS CountSuppliers 
FROM Account AS acc
LEFT JOIN Division AS div ON (div.AccountId = acc.Id)
LEFT JOIN AccountSupplier AS acc_sup ON (acc_sup.AccountId = acc.Id)
GROUP BY acc.AccountName

Or, in a less resource-intensive way, with correlated subqueries:

 SELECT 
   acc.AccountName,
   (SELECT COUNT(*) FROM Division AS div WHERE div.AccountId = acc.Id) AS CountDivisions,
   (SELECT COUNT(*) FROM AccountSupplier AS acc_sup WHERE acc_sup.AccountId = acc.Id) AS CountSuppliers 
 FROM Account AS acc
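
As a rough check of the correlated-subquery approach, here is a minimal sketch that rebuilds the question's data in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server here, and the simplified column layout (AccountId plus a name column on each child table) is an assumption, since the question does not show the real schema.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (Id INTEGER PRIMARY KEY, AccountName TEXT);
    CREATE TABLE Division (AccountId INTEGER, DivisionName TEXT);
    CREATE TABLE AccountSupplier (AccountId INTEGER, SupplierName TEXT);
    INSERT INTO Account VALUES
        (1, 'Account #1'), (2, 'Account #2'), (3, 'Account #3');
    INSERT INTO Division VALUES
        (1, 'Division ONE for Account #1'), (1, 'Division TWO for Account #1');
    INSERT INTO AccountSupplier VALUES
        (1, 'Supplier #6'), (1, 'Supplier #1'), (1, 'Supplier #3'),
        (2, 'Supplier #1'), (2, 'Supplier #2');
""")

# Each subquery counts one child table on its own, so the two child
# tables never multiply each other's rows.
subquery_counts = conn.execute("""
    SELECT acc.AccountName,
           (SELECT COUNT(*) FROM Division AS div
             WHERE div.AccountId = acc.Id) AS CountDivisions,
           (SELECT COUNT(*) FROM AccountSupplier AS acc_sup
             WHERE acc_sup.AccountId = acc.Id) AS CountSuppliers
    FROM Account AS acc
    ORDER BY acc.AccountName
""").fetchall()

for row in subquery_counts:
    print(row)
```

Because no join between Division and AccountSupplier ever happens, no DISTINCT is needed to undo duplicated rows.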

Answer 2: (score: 0)

The problem in the query that causes the wrong results is that you are asking it to count the wrong field.

SELECT 
    acc.AccountName,
    COUNT(div.AccountId) AS CountDivisions,
    COUNT(acc_sup.AccountId) AS CountSuppliers 
FROM 
    Account AS acc
LEFT JOIN 
    Division AS div ON (div.AccountId = acc.Id)
LEFT JOIN 
    AccountSupplier AS acc_sup ON (acc_sup.AccountId = acc.Id)
GROUP BY 
    acc.AccountName

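When the count is changed to COUNT(DISTINCT div.AccountId) AS CountDivisions, the field being counted is div.AccountId, the same field used in the join, so its value is identical for every row in the group. Counting the distinct values of that field is, of course, 1, no matter how many rows match.

The count should be on a unique field in the child table. Assuming you have an Id field, it would look like this: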

SELECT 
    acc.AccountName,
    COUNT(DISTINCT div.Id) AS CountDivisions,
    COUNT(DISTINCT acc_sup.Id) AS CountSuppliers 
FROM 
    Account AS acc
LEFT JOIN 
    Division AS div ON (div.AccountId = acc.Id)
LEFT JOIN 
    AccountSupplier AS acc_sup ON (acc_sup.AccountId = acc.Id)
GROUP BY 
    acc.AccountName
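
To make the difference concrete, the sketch below runs the same grouped query twice against an in-memory SQLite database (via Python's sqlite3, used here as a stand-in for SQL Server): once counting the join key, once counting a per-row key. The Id columns on Division and AccountSupplier are assumptions, as is the rest of the simplified schema; the question never shows them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (Id INTEGER PRIMARY KEY, AccountName TEXT);
    -- Assumed surrogate keys on the child tables (not shown in the question).
    CREATE TABLE Division (
        Id INTEGER PRIMARY KEY, AccountId INTEGER, DivisionName TEXT);
    CREATE TABLE AccountSupplier (
        Id INTEGER PRIMARY KEY, AccountId INTEGER, SupplierName TEXT);
    INSERT INTO Account VALUES
        (1, 'Account #1'), (2, 'Account #2'), (3, 'Account #3');
    INSERT INTO Division (AccountId, DivisionName) VALUES
        (1, 'Division ONE for Account #1'), (1, 'Division TWO for Account #1');
    INSERT INTO AccountSupplier (AccountId, SupplierName) VALUES
        (1, 'Supplier #6'), (1, 'Supplier #1'), (1, 'Supplier #3'),
        (2, 'Supplier #1'), (2, 'Supplier #2');
""")

# Same query shape both times; only the counted column changes.
QUERY = """
    SELECT acc.AccountName,
           COUNT(DISTINCT div.{col}) AS CountDivisions,
           COUNT(DISTINCT acc_sup.{col}) AS CountSuppliers
    FROM Account AS acc
    LEFT JOIN Division AS div ON div.AccountId = acc.Id
    LEFT JOIN AccountSupplier AS acc_sup ON acc_sup.AccountId = acc.Id
    GROUP BY acc.AccountName
    ORDER BY acc.AccountName
"""

# Counting the join key: every matched row in a group shares one value.
by_join_key = conn.execute(QUERY.format(col="AccountId")).fetchall()
# Counting the child table's own key: one distinct value per child row.
by_child_id = conn.execute(QUERY.format(col="Id")).fetchall()

print(by_join_key)
print(by_child_id)
```

The first result reproduces the 1/1 counts from the question; the second matches the expected output, provided the child tables really do carry a unique column.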

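Answer 3: (score: 0)

You were very close. You were just missing the columns that the aggregates need to be calculated from.

Based on your setup data, let's look at the rows the given query actually works with. We want a LEFT OUTER JOIN because every AccountName should be counted even when it has no DivisionName or SupplierName. Those rows come back as NULL, which translates to a count of 0.

So: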

SELECT *
FROM Account acc
LEFT JOIN Division div ON (div.AccountId = acc.Id)
LEFT JOIN AccountSupplier acc_sup ON (acc_sup.AccountId = acc.Id)  ;

gives us:

id | AccountName | accountID | DivisionName | accountID | SupplierName
-: | :---------- | --------: | :----------- | --------: | :-----------
 1 | Acct1       |         1 | Div2         |         1 | Supplier6   
 1 | Acct1       |         1 | Div2         |         1 | Supplier1   
 1 | Acct1       |         1 | Div2         |         1 | Supplier3   
 1 | Acct1       |         1 | Div1         |         1 | Supplier6   
 1 | Acct1       |         1 | Div1         |         1 | Supplier1   
 1 | Acct1       |         1 | Div1         |         1 | Supplier3   
 2 | Acct2       |      null | null         |         2 | Supplier1   
 2 | Acct2       |      null | null         |         2 | Supplier2   
 3 | Acct3       |      null | null         |      null | null        

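With that, we can verify whether our counts are working correctly (keep in mind that NULL has some restrictions on how math can be applied to it).

From these rows, we can see that Acct1 has only two distinct DivisionName values, and the other two accounts have none. For SupplierName, Acct1 has 3 distinct values, Acct2 has 2, and Acct3 has none. That more or less gives us a plain-language description of what we need to do: we need the unique names of the divisions and of the suppliers.

So: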
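
The row multiplication and the NULL handling described above can be checked with a small sketch. It uses an in-memory SQLite database through Python's sqlite3 as a stand-in for SQL Server, with the same simplified schema as the joined output shown earlier (an assumption about the real tables).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (Id INTEGER PRIMARY KEY, AccountName TEXT);
    CREATE TABLE Division (AccountId INTEGER, DivisionName TEXT);
    CREATE TABLE AccountSupplier (AccountId INTEGER, SupplierName TEXT);
    INSERT INTO Account VALUES (1, 'Acct1'), (2, 'Acct2'), (3, 'Acct3');
    INSERT INTO Division VALUES (1, 'Div1'), (1, 'Div2');
    INSERT INTO AccountSupplier VALUES
        (1, 'Supplier6'), (1, 'Supplier1'), (1, 'Supplier3'),
        (2, 'Supplier1'), (2, 'Supplier2');
""")

# COUNT(*) counts joined rows; COUNT(column) skips rows where the column is NULL.
rows_per_account = conn.execute("""
    SELECT acc.AccountName,
           COUNT(*) AS JoinedRows,
           COUNT(div.AccountId) AS NonNullDivisionRows,
           COUNT(acc_sup.AccountId) AS NonNullSupplierRows
    FROM Account AS acc
    LEFT JOIN Division AS div ON div.AccountId = acc.Id
    LEFT JOIN AccountSupplier AS acc_sup ON acc_sup.AccountId = acc.Id
    GROUP BY acc.AccountName
    ORDER BY acc.AccountName
""").fetchall()

for row in rows_per_account:
    print(row)
```

Acct1 ends up with 2 x 3 = 6 joined rows, which is where the 6/6 in the question comes from, while Acct3 still produces one all-NULL row that COUNT(column) ignores.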

SELECT 
    acc.AccountName,
    COUNT(DISTINCT div.DivisionName) AS CountDivisions,
    COUNT(DISTINCT acc_sup.SupplierName) AS CountSuppliers 
FROM Account acc
LEFT JOIN Division div ON div.AccountId = acc.Id
LEFT JOIN AccountSupplier acc_sup ON acc_sup.AccountId = acc.Id
GROUP BY acc.AccountName ;

This gives us the counts we expect:

AccountName | CountDivisions | CountSuppliers
:---------- | -------------: | -------------:
Acct1       |              2 |              3
Acct2       |              0 |              2
Acct3       |              0 |              0
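
Counting distinct names relies on names being unique within an account. As an alternative that does not depend on that, one possible sketch (not part of the original answer) aggregates each child table per AccountId first and only then joins the totals to Account. The derived-table aliases d and s, the Cnt column, and the SQLite-via-Python setup are all illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (Id INTEGER PRIMARY KEY, AccountName TEXT);
    CREATE TABLE Division (AccountId INTEGER, DivisionName TEXT);
    CREATE TABLE AccountSupplier (AccountId INTEGER, SupplierName TEXT);
    INSERT INTO Account VALUES (1, 'Acct1'), (2, 'Acct2'), (3, 'Acct3');
    INSERT INTO Division VALUES (1, 'Div1'), (1, 'Div2');
    INSERT INTO AccountSupplier VALUES
        (1, 'Supplier6'), (1, 'Supplier1'), (1, 'Supplier3'),
        (2, 'Supplier1'), (2, 'Supplier2');
""")

# Each derived table yields at most one row per AccountId, so joining
# both to Account cannot multiply rows. COALESCE turns a missing match into 0.
pre_aggregated = conn.execute("""
    SELECT acc.AccountName,
           COALESCE(d.Cnt, 0) AS CountDivisions,
           COALESCE(s.Cnt, 0) AS CountSuppliers
    FROM Account AS acc
    LEFT JOIN (SELECT AccountId, COUNT(*) AS Cnt
                 FROM Division GROUP BY AccountId) AS d
           ON d.AccountId = acc.Id
    LEFT JOIN (SELECT AccountId, COUNT(*) AS Cnt
                 FROM AccountSupplier GROUP BY AccountId) AS s
           ON s.AccountId = acc.Id
    ORDER BY acc.AccountName
""").fetchall()

for row in pre_aggregated:
    print(row)
```

With the sample data this returns the same counts as the DISTINCT version; the two approaches would only diverge if an account had two child rows with the same name.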

db<>fiddle here