Using multiple left joins to compute an average and a count

Asked: 2009-10-10 01:40:55

Tags: sql left-join

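I'm trying to figure out how to use multiple left outer joins to compute an average score and a card count. I have the following schema and test data. Each deck has 0 or more scores and 0 or more cards. I need to compute the average score and the number of cards for each deck. I'm using MySQL for convenience; ultimately I want this to run on SQLite on an Android phone.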

mysql> select * from deck;
+----+-------+
| id | name  |
+----+-------+
|  1 | one   | 
|  2 | two   | 
|  3 | three | 
+----+-------+
mysql> select * from score;
+---------+-------+---------------------+--------+
| scoreId | value | date                | deckId |
+---------+-------+---------------------+--------+
|       1 |  6.58 | 2009-10-05 20:54:52 |      1 | 
|       2 |     7 | 2009-10-05 20:54:58 |      1 | 
|       3 |  4.67 | 2009-10-05 20:55:04 |      1 | 
|       4 |     7 | 2009-10-05 20:57:38 |      2 | 
|       5 |     7 | 2009-10-05 20:57:41 |      2 | 
+---------+-------+---------------------+--------+
mysql> select * from card;
+--------+-------+------+--------+
| cardId | front | back | deckId |
+--------+-------+------+--------+
|      1 | fron  | back |      2 | 
|      2 | fron  | back |      1 | 
|      3 | f1    | b2   |      1 | 
+--------+-------+------+--------+

I run the following query...


mysql> select deck.name, sum(score.value)/count(score.value) "Ave", 
    ->   count(card.front) "Count" 
    -> from deck 
    -> left outer join score on deck.id=score.deckId 
    -> left outer join card on deck.id=card.deckId
    -> group by deck.id;

+-------+-----------------+-------+
| name  | Ave             | Count |
+-------+-----------------+-------+
| one   | 6.0833333333333 |     6 | 
| two   |               7 |     2 | 
| three |            NULL |     0 | 
+-------+-----------------+-------+

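...and I get the right answer for the average, but the wrong answer for the number of cards. Can someone tell me what I'm doing wrong before I pull my hair out?

Thanks!

John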

4 answers:

Answer 0 (score: 1)

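It's doing exactly what you asked: it joins cards 2 and 3 to scores 1, 2, and 3, which produces a count of 6 (2 * 3). In the case of card 1, it joins to scores 4 and 5, which produces a count of 2 (1 * 2).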

If you just want a count of the cards, which is what you are currently trying to get, use COUNT(DISTINCT card.cardId).
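To make the row multiplication concrete, here is a minimal sketch that loads the question's test data into an in-memory SQLite database (via Python's standard sqlite3 module, since SQLite is the eventual target) and compares a plain COUNT with COUNT(DISTINCT card.cardId). The column types are assumptions, and the date column of score is left out because it plays no part in the result.

```python
import sqlite3

# In-memory database with the question's test data.
# Column types are assumed; score.date is omitted for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

rows = conn.execute("""
    SELECT deck.name,
           COUNT(card.cardId)          AS joined_rows,  -- one per (score, card) pair
           COUNT(DISTINCT card.cardId) AS cards         -- each card counted once
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    LEFT OUTER JOIN card  ON deck.id = card.deckId
    GROUP BY deck.id
    ORDER BY deck.id
""").fetchall()

for row in rows:
    print(row)
# ('one', 6, 2)
# ('two', 2, 1)
# ('three', 0, 0)
```

Counting the distinct primary key rather than a data column such as front avoids undercounting when two cards in the same deck happen to share the same text.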

Answer 1 (score: 1)

select deck.name, coalesce(x.ave,0) as ave, count(card.cardId) as count -- count a card column, not count(*): count(*) would count 1 for a deck with no cards. (count(card.*) states the intent more directly, but MySQL and SQLite reject it.)
from deck
left join -- first aggregate the scores down to one row per deck
(
    select deckId, sum(value)/count(*) as ave -- count(*) counts score rows rather than a column, which makes the intent clearer
    from score
    group by deckId
) as x on x.deckId = deck.id
left outer join card on card.deckId = deck.id -- then join the per-deck averages to the cards
group by deck.id, x.ave, deck.name
order by deck.id

[Edit]

SQL has a built-in average function; just use:

select deckId, avg(value) as ave
from score 
group by deckId
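As a rough check of this approach on the eventual target, the sketch below runs the pre-aggregating join against an in-memory SQLite database through Python's sqlite3 module. It assumes the column types shown, omits the date column of score, uses avg(value) inside the derived table, and leaves the average as NULL for a deck with no scores instead of coalescing it to 0.

```python
import sqlite3

# In-memory database with the question's test data.
# Column types are assumed; score.date is omitted for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

rows = conn.execute("""
    SELECT deck.name, x.ave, COUNT(card.cardId) AS cards
    FROM deck
    LEFT JOIN (
        -- one row per deck, so joining card afterwards cannot multiply scores
        SELECT deckId, AVG(value) AS ave
        FROM score
        GROUP BY deckId
    ) AS x ON x.deckId = deck.id
    LEFT JOIN card ON card.deckId = deck.id
    GROUP BY deck.id, deck.name, x.ave
    ORDER BY deck.id
""").fetchall()

for name, ave, cards in rows:
    # ave is None (SQL NULL) for a deck that has no scores
    print(name, None if ave is None else round(ave, 2), cards)
# one 6.08 2
# two 7.0 1
# three None 0
```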

Answer 2 (score: 1)

The problem occurs because you are creating a Cartesian product between score and card.

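Here's how it works: when you join deck to score, you may get multiple matching rows. Each of those rows is then joined to all of the matching rows in card. No condition prevents this, and when nothing restricts it, the default join behavior is to pair every row from one table with every row from the other.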

To see it in action, try this query without the group by:

select * 
from deck 
left outer join score on deck.id=score.deckId 
left outer join card on deck.id=card.deckId;

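You'll see a lot of repeated data in the columns for score and card. When you compute AVG() over data that contains repeats, the redundant values magically cancel out (as long as every value is repeated the same number of times). But when you take a COUNT() or SUM(), the totals are way off.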
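A small sketch of that effect, using Python's sqlite3 module with the question's data in an in-memory database (column types are assumed and the date column of score is omitted). It reports how many joined rows each deck produces and compares AVG() and SUM() over the joined rows with the sum taken from score alone. Every score of a deck is repeated once per card of that deck, so the repetition is uniform and the average survives, while the sum does not.

```python
import sqlite3

# In-memory database with the question's test data.
# Column types are assumed; score.date is omitted for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

# Aggregates computed over the multiplied (score x card) rows.
rows = conn.execute("""
    SELECT deck.name,
           COUNT(*)         AS joined_rows,
           AVG(score.value) AS ave,
           SUM(score.value) AS inflated_sum
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    LEFT OUTER JOIN card  ON deck.id = card.deckId
    GROUP BY deck.id
    ORDER BY deck.id
""").fetchall()

# Reference sums computed from score alone, keyed by deckId.
true_sums = dict(conn.execute(
    "SELECT deckId, SUM(value) FROM score GROUP BY deckId"))

for name, joined_rows, ave, inflated_sum in rows:
    print(name, joined_rows, ave, inflated_sum)
print(true_sums)
```

Deck one yields 6 joined rows (3 scores times 2 cards), so its SUM() comes out at twice the value stored in true_sums, while its AVG() is the same as the average of the three original scores.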

There may be remedies for an unintentional Cartesian product. In your case, you can use COUNT(DISTINCT) to compensate:

select deck.name, avg(score.value) "Ave", count(DISTINCT card.front) "Count" 
from deck 
left outer join score on deck.id=score.deckId 
left outer join card on deck.id=card.deckId
group by deck.id;

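This solution does not fix every case of an unintentional Cartesian product. A more general solution is to break the work into two separate queries: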

select deck.name, avg(score.value) "Ave"
from deck 
left outer join score on deck.id=score.deckId 
group by deck.id;

select deck.name, count(card.front) "Count" 
from deck 
left outer join card on deck.id=card.deckId
group by deck.id;
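If the two result sets need to end up side by side, the application can stitch them together. Below is a minimal sketch of that idea with Python's sqlite3 module and an in-memory database holding the question's data; the column types, the omitted date column, and the dictionary-based merge are all assumptions made for illustration, and keying by deck name assumes names are unique.

```python
import sqlite3

# In-memory database with the question's test data.
# Column types are assumed; score.date is omitted for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

# Query 1: average score per deck (None for a deck with no scores).
averages = dict(conn.execute("""
    SELECT deck.name, AVG(score.value)
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    GROUP BY deck.id
"""))

# Query 2: card count per deck (0 for a deck with no cards).
counts = dict(conn.execute("""
    SELECT deck.name, COUNT(card.cardId)
    FROM deck
    LEFT OUTER JOIN card ON deck.id = card.deckId
    GROUP BY deck.id
"""))

# Merge in application code; each query joins only one child table,
# so neither result is affected by the other table's row count.
merged = {name: (averages[name], counts[name]) for name in averages}
print(merged)
```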

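Not every task in database programming has to be done in a single query. When you need multiple statistics, it can even be more efficient (as well as simpler, easier to modify, and less error-prone) to use separate queries.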

Answer 3 (score: 0)

In my opinion, a left join is not a good approach here. This is a standard SQL query for the result you want.

select
  name,
  (select avg(value) from score where score.deckId = deck.id) as Ave,
  (select count(*) from card where card.deckId = deck.id) as "Count"
from deck;
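For what it's worth, here is a minimal sketch that runs this correlated-subquery form on SQLite through Python's sqlite3 module, using the question's data in an in-memory database. The column types are assumed, the date column of score is omitted, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory database with the question's test data.
# Column types are assumed; score.date is omitted for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

# Each scalar subquery looks at exactly one child table,
# so score rows and card rows are never paired with each other.
rows = conn.execute("""
    SELECT name,
           (SELECT AVG(value) FROM score WHERE score.deckId = deck.id) AS Ave,
           (SELECT COUNT(*)   FROM card  WHERE card.deckId  = deck.id) AS "Count"
    FROM deck
    ORDER BY deck.id
""").fetchall()

for name, ave, count in rows:
    print(name, None if ave is None else round(ave, 2), count)
# one 6.08 2
# two 7.0 1
# three None 0
```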