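I want to write a window function in MySQL that gives a rolling 30-day count of unique IDs. More precisely, my database has many entries per day, each with a timestamp, covering many different IDs. For each day I want to count how many distinct IDs connected that day, and also get the total number of distinct IDs that were online during the past 30 days.
Consider the following table: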
CREATE TABLE `my_database` (
`timestamp` BIGINT(20) UNSIGNED NOT NULL,
`id` VARCHAR(32) NOT NULL);
INSERT INTO my_database (timestamp,id) VALUES (CURDATE(),1);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 1 DAY),2);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 2 DAY),1);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 2 DAY),3);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 29 DAY),4);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 300 DAY),2);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 1000 DAY),5);
It looks like this:
timestamp id
20190730 1
20190729 2
20190728 1
20190728 3
20190701 4
20181003 2
20161102 5
The result I want is:
date count_day count_30day
2019-07-30 1 4
2019-07-29 1 4
2019-07-28 2 3
2019-07-01 1 1
2018-10-03 1 1
2016-11-02 1 1
I don't know how to get the count_30day column. So far I have written the following:
SELECT DATE(a.`timestamp`) AS 'date',
COUNT(DISTINCT a.id) AS 'count_day',
COUNT(DISTINCT a.id) OVER (ORDER BY DATE(a.`timestamp`) ROWS BETWEEN 30 PRECEDING AND CURRENT ROW) AS 'count_30day'
FROM my_database AS a
GROUP BY DATE(a.`timestamp`)
ORDER BY DATE(a.`timestamp`) DESC
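But it does not work for the count_30day column. I have looked through other questions, and as far as I can tell from the documentation the window function syntax looks right, but evidently it is not, since the query fails. How do I write the window function correctly? Is there a better approach than COUNT(DISTINCT)? Thanks!
Answer 0 (score: 0)
ROWS PRECEDING counts rows, not days.
You need a subquery: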
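SELECT DATE(a.`timestamp`) AS 'date',
       COUNT(DISTINCT a.id) AS 'count_day',
       -- Every row of a given day yields the same subquery value, so MAX()
       -- only serves to make the expression valid under GROUP BY.
       MAX( (SELECT COUNT(DISTINCT db2.id)
             FROM my_database AS db2
             -- DATE() turns the BIGINT yyyymmdd values into real dates, so the
             -- range is compared as dates rather than as numbers.
             WHERE DATE(db2.`timestamp`) BETWEEN DATE_SUB(DATE(a.`timestamp`), INTERVAL 30 DAY)
                                             AND DATE(a.`timestamp`)
            )
          ) AS count30
FROM my_database AS a
GROUP BY DATE(a.`timestamp`)
ORDER BY DATE(a.`timestamp`) DESC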
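To check what a day-based window should return on the sample data, and how it differs from a row-based frame, here is a small Python sketch. It is not part of the SQL solution. The names `day_window_counts` and `row_frame_counts` are made up for illustration, and "today" is pinned to 2019-07-30 as an assumption so the numbers line up with the tables in the question. As far as I know, MySQL 8.0 also does not accept DISTINCT inside an aggregate used as a window function, which would be a second reason the original attempt fails.

```python
from datetime import date, timedelta

# Sample rows from the question as (day, id) pairs. TODAY is pinned to
# 2019-07-30 (an assumption, so the output matches the tables in the question).
TODAY = date(2019, 7, 30)
ROWS = [(TODAY - timedelta(days=n), i) for n, i in
        [(0, "1"), (1, "2"), (2, "1"), (2, "3"), (29, "4"), (300, "2"), (1000, "5")]]


def day_window_counts(rows, window_days=30):
    """(day, count_day, count_window) per day, newest first, using a date range.

    The range is [day - window_days, day], both ends inclusive, which mirrors
    BETWEEN DATE_SUB(day, INTERVAL 30 DAY) AND day.
    """
    out = []
    for day in sorted({d for d, _ in rows}, reverse=True):
        start = day - timedelta(days=window_days)
        ids_day = {i for d, i in rows if d == day}
        ids_window = {i for d, i in rows if start <= d <= day}
        out.append((day.isoformat(), len(ids_day), len(ids_window)))
    return out


def row_frame_counts(rows, preceding=30):
    """Same shape, but the frame is the current daily row plus up to `preceding`
    earlier daily rows, which is what ROWS BETWEEN 30 PRECEDING AND CURRENT ROW
    means once the data is grouped to one row per day.
    """
    days = sorted({d for d, _ in rows})  # oldest first
    out = []
    for pos, day in enumerate(days):
        frame = set(days[max(0, pos - preceding):pos + 1])
        ids_day = {i for d, i in rows if d == day}
        ids_frame = {i for d, i in rows if d in frame}
        out.append((day.isoformat(), len(ids_day), len(ids_frame)))
    return out[::-1]  # newest first


if __name__ == "__main__":
    for name, fn in (("date range", day_window_counts), ("row frame", row_frame_counts)):
        print(name)
        for row in fn(ROWS):
            print(" ", *row)
```

On the sample data the date-range version reproduces the count_30day column from the question (4, 4, 3, 1, 1, 1). The row-frame version returns 5, 5, 5, 3, 2, 1, because all six daily rows fit inside a 31-row frame regardless of how far apart the dates are. Note that an inclusive 30-day range covers 31 calendar days. Pass `window_days=29` (or use INTERVAL 29 DAY in the SQL) if "past 30 days" should include the current day.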