MySQL: SUM results are incorrect, but I can't tell why?

Time: 2016-11-17 20:06:44

Tags: mysql

I am trying to find the customer with the largest month-over-month change in sales (in this case, from June to July).

Here is the mock data I created for this exercise:

mysql> select * from Sales1;
+------------+------------+-----------------+
| CustomerID | mydate     | purchase_amount |
+------------+------------+-----------------+
|         10 | 1996-08-02 |         2540.78 |
|         20 | 1999-01-30 |         1800.54 |
|         30 | 1995-07-14 |          460.33 |
|         10 | 1998-06-29 |            2400 |
|         50 | 1998-02-03 |          600.28 |
|         60 | 1998-03-02 |             720 |
|         10 | 1998-07-06 |             150 |
+------------+------------+-----------------+
mysql> select * from Sales2;
+------------+------------+-----------------+
| CustomerID | mydate     | purchase_amount |
+------------+------------+-----------------+
|         10 | 1996-06-02 |          540.78 |
|         20 | 1999-09-30 |          800.54 |
|         30 | 1995-07-14 |           60.33 |
|         40 | 1998-01-29 |             400 |
|         10 | 1998-07-03 |         2600.28 |
|         60 | 1998-03-02 |            1720 |
|         70 | 1998-05-04 |            4150 |
+------------+------------+-----------------+
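
To reproduce the problem locally, the two tables can be recreated roughly as follows. This is a sketch, not the original DDL, which is not shown in the question: the column types are assumptions, and FLOAT for purchase_amount is inferred only from summed values such as 2750.280029296875 in the output further down.

```sql
-- Assumed schema: the original CREATE TABLE statements are not shown.
-- FLOAT is a guess based on the rounding noise visible in the SUM() output.
CREATE TABLE Sales1 (
  CustomerID      INT   NOT NULL,
  mydate          DATE  NOT NULL,
  purchase_amount FLOAT NOT NULL
);

CREATE TABLE Sales2 (
  CustomerID      INT   NOT NULL,
  mydate          DATE  NOT NULL,
  purchase_amount FLOAT NOT NULL
);

-- Rows copied from the SELECT * output above.
INSERT INTO Sales1 (CustomerID, mydate, purchase_amount) VALUES
  (10, '1996-08-02', 2540.78),
  (20, '1999-01-30', 1800.54),
  (30, '1995-07-14',  460.33),
  (10, '1998-06-29', 2400),
  (50, '1998-02-03',  600.28),
  (60, '1998-03-02',  720),
  (10, '1998-07-06',  150);

INSERT INTO Sales2 (CustomerID, mydate, purchase_amount) VALUES
  (10, '1996-06-02',  540.78),
  (20, '1999-09-30',  800.54),
  (30, '1995-07-14',   60.33),
  (40, '1998-01-29',  400),
  (10, '1998-07-03', 2600.28),
  (60, '1998-03-02', 1720),
  (70, '1998-05-04', 4150);
```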

Based on the two tables above, the answer should be CustomerID 10, whose sales increased by 350.28 from June to July 1998.
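
As a quick sanity check on that figure, the arithmetic can be written out directly. This is only a sketch using the literal values from the two tables above. Note that the July total combines one row from Sales1 and one row from Sales2.

```sql
-- June 1998, CustomerID 10: 2400 (Sales1)
-- July 1998, CustomerID 10: 150 (Sales1) + 2600.28 (Sales2)
SELECT (150 + 2600.28) - 2400 AS expected_diff;  -- 350.28
```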

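Below is the code I wrote for this. Basically, I created two views: one holding the sum of all June sales per customer per year, and the other holding the sum of all July sales per customer per year. I then subtract the June sales from the July sales: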

CREATE VIEW sum6 AS
(
SELECT CustomerID, 
YEAR(mydate) AS year, 
MONTH(mydate) AS month,
SUM(purchase_amount) as amount
FROM Sales1
GROUP BY CustomerID, year, month
HAVING month = 6
) 
UNION ALL (
SELECT CustomerID,
YEAR(mydate) AS year, 
MONTH(mydate) AS month,
SUM(purchase_amount) as amount
FROM Sales2
GROUP BY CustomerID, year, month
HAVING month = 6) 
;

CREATE VIEW sum7 AS
(
SELECT CustomerID, 
YEAR(mydate) AS year, 
MONTH(mydate) AS month,
SUM(purchase_amount) as amount
FROM Sales1
GROUP BY CustomerID, year, month
HAVING month = 7
) 
UNION ALL (
SELECT CustomerID,
YEAR(mydate) AS year, 
MONTH(mydate) AS month,
SUM(purchase_amount) as amount
FROM Sales2
GROUP BY CustomerID, year, month
HAVING month = 7) 
;

SELECT CustomerID, year, (SUM(sum7.amount)-SUM(sum6.amount)) as diff
FROM sum6
JOIN sum7
USING(CustomerID, year)
GROUP BY CustomerID, year
;

However, my output is:

+------------+------+--------------------+
| CustomerID | year | diff               |
+------------+------+--------------------+
|         10 | 1998 | -2049.719970703125 |
+------------+------+--------------------+

The CustomerID and year values are correct, but the difference is not.

I checked separately whether the totals in sum6 and sum7 are computed correctly per CustomerID and year:

mysql> SELECT CustomerID, year, SUM(amount)
    -> FROM sum7
    -> GROUP BY CustomerID, year
    -> ;
+------------+------+-------------------+
| CustomerID | year | SUM(amount)       |
+------------+------+-------------------+
|         10 | 1998 | 2750.280029296875 |
|         30 | 1995 | 520.6599884033203 |
+------------+------+-------------------+
mysql> SELECT CustomerID, year, SUM(amount)
    -> FROM sum6
    -> GROUP BY CustomerID, year
    -> ;
+------------+------+------------------+
| CustomerID | year | SUM(amount)      |
+------------+------+------------------+
|         10 | 1996 | 540.780029296875 |
|         10 | 1998 |             2400 |
+------------+------+------------------+

They are, so I know the GROUP BY is correct.

Then I tried looking at the two sums individually:

mysql> SELECT CustomerID, year, SUM(sum7.amount), SUM(sum6.amount)
    -> FROM sum6
    -> JOIN sum7
    -> USING(CustomerID, year)
    -> GROUP BY CustomerID, year
    -> ;
+------------+------+-------------------+------------------+
| CustomerID | year | SUM(sum7.amount)  | SUM(sum6.amount) |
+------------+------+-------------------+------------------+
|         10 | 1998 | 2750.280029296875 |             4800 |
+------------+------+-------------------+------------------+

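So SUM(sum7.amount) is correct but SUM(sum6.amount) is not. How can both add up correctly when queried separately, yet only one of them come out wrong when they are combined? This inconsistency is driving me crazy...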

2 answers:

Answer 0 (score: 2)

Your JOIN statement is incomplete.

You are joining sum6 and sum7 too loosely. Taking your last example, the JOIN is duplicating records (2400 * 2 = 4800).

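When you total them, the way your join is set up pulls duplicate records from one of the views. Each view is a UNION ALL of per-table aggregates, so it can hold more than one row per CustomerID and year: sum7 has two rows for CustomerID 10 in 1998 (150 from Sales1 and 2600.28 from Sales2), and the single sum6 row (2400) is matched against both of them. You need to check the join conditions.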

To help narrow it down, include all rows and don't do any math until you can verify the data. Start with:

SELECT *
FROM sum6
JOIN sum7
USING(CustomerID, year)

and confirm that only the rows you intend to pair are being paired, then go from there.
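
To make the row pairing easier to read, and to show one possible way out, here is a sketch that reuses the sum6 and sum7 views from the question. The column aliases june_amount and july_amount and the derived-table names june and july are made up for illustration. The second query collapses each view to one row per CustomerID and year before joining, which is one way (not the only way) to stop the June row from being counted twice.

```sql
-- Step 1: look at the raw pairing, no math yet.
SELECT CustomerID, year,
       sum6.amount AS june_amount,
       sum7.amount AS july_amount
FROM sum6
JOIN sum7 USING (CustomerID, year);
-- With the sample data, CustomerID 10 / 1998 comes back twice
-- (row order is not guaranteed):
--   june_amount 2400 paired with the July row from Sales1 (150)
--   june_amount 2400 paired with the July row from Sales2 (about 2600.28)
-- Summing june_amount over those two rows gives the 4800 seen in the question.

-- Step 2: aggregate each view down to one row per key, then join.
SELECT CustomerID, year, july.amount - june.amount AS diff
FROM (
  SELECT CustomerID, year, SUM(amount) AS amount
  FROM sum6
  GROUP BY CustomerID, year
) AS june
JOIN (
  SELECT CustomerID, year, SUM(amount) AS amount
  FROM sum7
  GROUP BY CustomerID, year
) AS july
USING (CustomerID, year);
```

With the sample data, the second query should return a single row for CustomerID 10 in 1998 with a diff of roughly 350.28.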

Answer 1 (score: 0)

Thanks a lot, Fritz. I came up with another solution that is simpler (at least to me).

Below is my code, which achieves my goal without any problem:

CREATE VIEW all67 AS
(
SELECT CustomerID, YEAR(mydate) AS year,  MONTH(mydate) AS month, SUM(purchase_amount) AS amount
FROM Sales1
GROUP BY CustomerID, year, month
HAVING month = 6 OR month  = 7 
)
UNION ALL 
(
SELECT CustomerID, YEAR(mydate) AS year,  MONTH(mydate) AS month, SUM(purchase_amount) AS amount
FROM Sales2
GROUP BY CustomerID, year, month
HAVING month = 6 OR month  = 7 
)
;

SELECT CustomerID, year, july.amount - june.amount AS diff
FROM
(
SELECT CustomerID, year, month, SUM(amount) AS amount
FROM all67
GROUP BY CustomerID, year, month
HAVING month = 6
) june
JOIN
(
SELECT CustomerID, year, month, SUM(amount) AS amount
FROM all67
GROUP BY CustomerID, year, month
HAVING month = 7
) july
USING (CustomerID, year)
;

Now my answer finally comes out right! Many thanks, Fritz. I hope my answer helps those of you with similar questions.

Cheers!
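
For comparison, the same result can probably be reached without a view or a self-join by using conditional aggregation. This is a different technique from the one in the answer above, offered only as a sketch: the derived-table alias all_sales is made up, and the HAVING clause is an assumption about the intent, namely keeping only customers who have both June and July sales in the same year (which is what the inner join does).

```sql
SELECT CustomerID,
       YEAR(mydate) AS year,
       -- July total minus June total, computed in a single pass
       SUM(CASE WHEN MONTH(mydate) = 7 THEN purchase_amount ELSE 0 END)
     - SUM(CASE WHEN MONTH(mydate) = 6 THEN purchase_amount ELSE 0 END) AS diff
FROM (
  -- Stack the raw rows first, so each sale is counted exactly once
  SELECT CustomerID, mydate, purchase_amount FROM Sales1
  UNION ALL
  SELECT CustomerID, mydate, purchase_amount FROM Sales2
) AS all_sales
WHERE MONTH(mydate) IN (6, 7)
GROUP BY CustomerID, YEAR(mydate)
-- Keep only groups that contain both a June and a July sale
HAVING COUNT(DISTINCT MONTH(mydate)) = 2
-- Largest increase first; drop LIMIT to see every customer/year
ORDER BY diff DESC
LIMIT 1;
```

With the sample data this should return CustomerID 10 for 1998 with a diff of roughly 350.28. Because the rows are combined before any grouping, there is no intermediate result with several rows per customer and year, so the join fan-out described in Answer 0 cannot occur.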