Grouping by month and year gives the wrong answer

Time: 2019-06-21 22:30:44

Tags: mysql sql

I created a table and inserted values into the database as follows:

CREATE TABLE task ( date DATE, total_rides INT );
INSERT INTO TASK VALUES(('2011-01-01'), 985);
INSERT INTO TASK VALUES(('2011-01-02'), 801);
INSERT INTO TASK VALUES(('2011-01-03'), 1349);
INSERT INTO TASK VALUES(('2011-01-04'), 1562);
INSERT INTO TASK VALUES(('2011-01-05'), 1600);
INSERT INTO TASK VALUES(('2011-01-06'), 1606);
INSERT INTO TASK VALUES(('2011-01-07'), 1510);
INSERT INTO TASK VALUES(('2011-01-08'), 959);
INSERT INTO TASK VALUES(('2011-01-09'), 822);
INSERT INTO TASK VALUES(('2011-01-10'), 1321);
INSERT INTO TASK VALUES(('2011-01-11'), 1263);
INSERT INTO TASK VALUES(('2011-01-12'), 1162);
INSERT INTO TASK VALUES(('2011-01-13'), 1406);
INSERT INTO TASK VALUES(('2011-01-14'), 1421);
INSERT INTO TASK VALUES(('2011-01-15'), 1248);
INSERT INTO TASK VALUES(('2011-01-16'), 1204);
INSERT INTO TASK VALUES(('2011-01-17'), 1000);
INSERT INTO TASK VALUES(('2011-01-18'), 683);
INSERT INTO TASK VALUES(('2011-01-19'), 1650);
INSERT INTO TASK VALUES(('2011-01-20'), 1927);
INSERT INTO TASK VALUES(('2011-01-21'), 1543);
INSERT INTO TASK VALUES(('2011-01-22'), 981);
INSERT INTO TASK VALUES(('2011-01-23'), 986);
INSERT INTO TASK VALUES(('2011-01-24'), 1416);
INSERT INTO TASK VALUES(('2011-01-25'), 1985);
INSERT INTO TASK VALUES(('2011-01-26'), 506);
INSERT INTO TASK VALUES(('2011-01-27'), 431);
INSERT INTO TASK VALUES(('2011-01-28'), 1167);
INSERT INTO TASK VALUES(('2011-01-29'), 1098);
INSERT INTO TASK VALUES(('2011-01-30'), 1096);
INSERT INTO TASK VALUES(('2011-01-31'), 1501);
INSERT INTO TASK VALUES(('2011-02-01'), 1360);
INSERT INTO TASK VALUES(('2011-02-02'), 1526);
INSERT INTO TASK VALUES(('2011-02-03'), 1550);
INSERT INTO TASK VALUES(('2011-02-04'), 1708);
INSERT INTO TASK VALUES(('2011-02-05'), 1005);
INSERT INTO TASK VALUES(('2011-02-06'), 1623);
INSERT INTO TASK VALUES(('2011-02-07'), 1712);
INSERT INTO TASK VALUES(('2011-02-08'), 1530);
INSERT INTO TASK VALUES(('2011-02-09'), 1605);
INSERT INTO TASK VALUES(('2011-02-10'), 1538);
INSERT INTO TASK VALUES(('2011-02-11'), 1746);
INSERT INTO TASK VALUES(('2011-02-12'), 1472);
INSERT INTO TASK VALUES(('2011-02-13'), 1589);
INSERT INTO TASK VALUES(('2011-02-14'), 1913);
INSERT INTO TASK VALUES(('2011-02-15'), 1815);
INSERT INTO TASK VALUES(('2011-02-16'), 2115);
INSERT INTO TASK VALUES(('2011-02-17'), 2475);
INSERT INTO TASK VALUES(('2011-02-18'), 2927);
INSERT INTO TASK VALUES(('2011-02-19'), 1635);
INSERT INTO TASK VALUES(('2011-02-20'), 1812);
INSERT INTO TASK VALUES(('2011-02-21'), 1107);
INSERT INTO TASK VALUES(('2011-02-22'), 1450);
INSERT INTO TASK VALUES(('2011-02-23'), 1917);
INSERT INTO TASK VALUES(('2011-02-24'), 1807);
INSERT INTO TASK VALUES(('2011-02-25'), 1461);
INSERT INTO TASK VALUES(('2011-02-26'), 1969);
INSERT INTO TASK VALUES(('2011-02-27'), 2402);
INSERT INTO TASK VALUES(('2011-02-28'), 1446);
INSERT INTO TASK VALUES(('2012-01-01'), 2294);
INSERT INTO TASK VALUES(('2012-01-02'), 1951);
INSERT INTO TASK VALUES(('2012-01-03'), 2236);
INSERT INTO TASK VALUES(('2012-01-04'), 2368);
INSERT INTO TASK VALUES(('2012-01-05'), 3272);
INSERT INTO TASK VALUES(('2012-01-06'), 4098);
INSERT INTO TASK VALUES(('2012-01-07'), 4521);
INSERT INTO TASK VALUES(('2012-01-08'), 3425);
INSERT INTO TASK VALUES(('2012-01-09'), 2376);
INSERT INTO TASK VALUES(('2012-01-10'), 3598);
INSERT INTO TASK VALUES(('2012-01-11'), 2177);
INSERT INTO TASK VALUES(('2012-01-12'), 4097);
INSERT INTO TASK VALUES(('2012-01-13'), 3214);
INSERT INTO TASK VALUES(('2012-01-14'), 2493);
INSERT INTO TASK VALUES(('2012-01-15'), 2311);
INSERT INTO TASK VALUES(('2012-01-16'), 2298);
INSERT INTO TASK VALUES(('2012-01-17'), 2935);
INSERT INTO TASK VALUES(('2012-01-18'), 3376);
INSERT INTO TASK VALUES(('2012-01-19'), 3292);
INSERT INTO TASK VALUES(('2012-01-20'), 3163);
INSERT INTO TASK VALUES(('2012-01-21'), 1301);
INSERT INTO TASK VALUES(('2012-01-22'), 1977);
INSERT INTO TASK VALUES(('2012-01-23'), 2432);
INSERT INTO TASK VALUES(('2012-01-24'), 4339);
INSERT INTO TASK VALUES(('2012-01-25'), 4270);
INSERT INTO TASK VALUES(('2012-01-26'), 4075);
INSERT INTO TASK VALUES(('2012-01-27'), 3456);
INSERT INTO TASK VALUES(('2012-01-28'), 4023);
INSERT INTO TASK VALUES(('2012-01-29'), 3243);
INSERT INTO TASK VALUES(('2012-01-30'), 3624);
INSERT INTO TASK VALUES(('2012-01-31'), 4509);

I want to calculate the average number of daily bike shares per month over the two years, as well as the variance of bike shares per month. For that, I wrote this query:

SELECT MONTH(DATE) AS mon, YEAR(date) AS Yr, AVG(task.total_rides) AS Average, std(task.total_rides) AS stdev, VARIANCE(task.total_rides) AS Var
FROM task
GROUP BY CAST(MONTH(task.date) AS VARCHAR(2)) + '-' + CAST(YEAR(task.date) AS VARCHAR(4));

The output it gives me is:

mon  Yr     Average      stdev      Var
1    2011   1231.9032    366.3764   134231.7003
2    2011   2456.9322    973.6375   947969.9615

Only the first result, for January 2011, is correct.

Instead of calculating the average, standard deviation, and variance separately for February 2011 and January 2012, it combines all the values belonging to February 2011 and January 2012 and then calculates the mean, standard deviation, and variance over the combined set.

Expected output: one row per month, with February 2011 and January 2012 reported separately.

Any idea what I am doing wrong?

Thanks in advance.

2 Answers:

Answer 0: (score: 2)

You need to either use the proper syntax for concatenation, or group by the separate month and year parts:

GROUP BY CONCAT(CAST(MONTH(task.date) AS VARCHAR(2)), '-', CAST(YEAR(task.date) AS VARCHAR(4)))

GROUP BY MONTH(task.date), YEAR(task.date)

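The latter (grouping by the separate parts) is preferred: it works on numeric data, it groups by the same expressions that you select (so it also works with strict SQL settings), and you don't use the concatenated string anywhere else.

Your approach doesn't work because + does not concatenate strings in MySQL. It is addition. And since MySQL assumes that any string can be converted to a number, it doesn't raise an error. As a result, your query evaluates both 2 + 0 + 2011 and 1 + 0 + 2012 to 2013 and groups those rows together.

When MySQL implicitly converts a string to a number, it takes as many numeric characters as it can from the start of the string. So '123abc' becomes 123, while '-' becomes zero, because it has no numeric characters at the start.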
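The coercion described above can be checked directly. This is a minimal sketch, assuming a MySQL session in which implicit string-to-number conversion only raises warnings; the column aliases are made up for illustration:

```sql
-- Each string operand is converted to a number before '+' is applied.
-- '-' has no leading digits, so it contributes 0.
SELECT '1' + '-' + '2011' AS jan_2011_key,    -- 1 + 0 + 2011 = 2012
       '2' + '-' + '2011' AS feb_2011_key,    -- 2 + 0 + 2011 = 2013
       '1' + '-' + '2012' AS jan_2012_key,    -- 1 + 0 + 2012 = 2013
       '123abc' + 0       AS leading_digits;  -- 123
```

February 2011 and January 2012 end up with the same key, 2013, which is why their rows land in a single group, while January 2011 gets its own key.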

Answer 1: (score: 1)

As I mentioned in the comments, the preferred approach is to add the year and the month to your GROUP BY separately:

SELECT MONTH(date) AS mon,
       YEAR(date) AS Yr,
       AVG(task.total_rides) AS Average,
       STD(task.total_rides) AS stdev,
       VARIANCE(task.total_rides) AS Var
FROM task
GROUP BY MONTH(date),
         YEAR(date);

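In general, you want to include every non-aggregated column in the GROUP BY clause. There are a few exceptions where you can get away without doing so (explained here), but relying on them makes your code less readable and less portable to other DBMSs.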
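As a rough sketch of what this rule guards against, assume a MySQL server with the ONLY_FULL_GROUP_BY SQL mode enabled (the default since MySQL 5.7.5) and the task table from the question. The ORDER BY in the second query is an addition for readable output, not part of the original answer:

```sql
-- Rejected under ONLY_FULL_GROUP_BY: YEAR(date) is selected but is
-- neither aggregated nor listed in GROUP BY.
SELECT MONTH(date) AS mon,
       YEAR(date) AS Yr,
       AVG(total_rides) AS Average
FROM task
GROUP BY MONTH(date);

-- Accepted: every non-aggregated expression in the SELECT list is grouped.
SELECT MONTH(date) AS mon,
       YEAR(date) AS Yr,
       AVG(total_rides) AS Average
FROM task
GROUP BY YEAR(date), MONTH(date)
ORDER BY Yr, mon;
```

With ONLY_FULL_GROUP_BY disabled, the first query would run, but the year shown for each month would be taken from an arbitrary row in that group.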