I am building an SQL report on the answers table:
id | created_at
1 | 2018-03-02 18:05:56
2 | 2018-04-02 18:05:56
3 | 2018-04-02 18:05:56
4 | 2018-05-02 18:05:56
5 | 2018-06-02 18:05:56
The desired output is:
weeks_ago | record_count (# of rows per weekly cohort) | growth (%)
-4 | 21 | 22%
-3 | 22 | -12%
-2 | 32 | 2%
-1 | 2 | 20%
0 | 31 | 0%
My query currently fails with the following error:
1111 - Invalid use of group function
What am I doing wrong here?
SELECT floor(datediff(f.created_at, curdate()) / 7) AS weeks_ago,
count(DISTINCT f.id) AS "New Records in Cohort",
100 * (count(*) - lag(count(*), 1) over (order by f.created_at)) / lag(count(*), 1) over (order by f.created_at) || '%' as growth
FROM answers f
WHERE f.completed_at IS NOT NULL
GROUP BY weeks_ago
HAVING count(*) > 1;
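Answer 0 (score: 1)
You cannot wrap the COUNT aggregate function inside lag in this query, because an aggregate function nested inside another function this way is not valid here.
You can try a subquery instead: aggregate in the inner query, then apply lag to the result.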
SELECT weeks_ago,
       NewRecords AS "New Records in Cohort",
       -- CONCAT instead of ||, which is a logical OR in MySQL's default SQL mode
       CONCAT(100 * (cnt - LAG(cnt, 1) OVER (ORDER BY weeks_ago)) / LAG(cnt, 1) OVER (ORDER BY weeks_ago), '%') AS growth
FROM (
    -- aggregate once per weekly cohort; the outer query then applies LAG to plain columns
    SELECT FLOOR(DATEDIFF(f.created_at, CURDATE()) / 7) AS weeks_ago,
           COUNT(*) AS cnt,
           COUNT(DISTINCT f.id) AS NewRecords
    FROM answers f
    GROUP BY weeks_ago
) t1
Answer 1 (score: 1)
I think you want a running count of all rows up to the current row. If so, you can drop the LAG function, as follows:
SELECT
COUNT(*) OVER (ORDER BY f.created_at ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) x, -- running count before current row
COUNT(*) OVER (ORDER BY f.created_at) y -- running count including current row
FROM answers f;
You can then divide these two counts however you need.
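To make the two running counts concrete, here is a minimal sketch that runs them against the sample rows from the question. It is an illustration under assumptions, not the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL (window functions need SQLite 3.25 or later), and it adds id as a tie-breaker plus an explicit ROWS frame so that the two rows sharing a timestamp get distinct counts.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answers (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO answers (id, created_at) VALUES (?, ?)",
    [
        (1, "2018-03-02 18:05:56"),
        (2, "2018-04-02 18:05:56"),
        (3, "2018-04-02 18:05:56"),
        (4, "2018-05-02 18:05:56"),
        (5, "2018-06-02 18:05:56"),
    ],
)

# x counts the rows strictly before the current row, y counts them up to
# and including it. Ordering by (created_at, id) makes ties deterministic.
running_counts = conn.execute(
    """
    SELECT id,
           COUNT(*) OVER (ORDER BY created_at, id
                          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS x,
           COUNT(*) OVER (ORDER BY created_at, id
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS y
    FROM answers
    ORDER BY created_at, id
    """
).fetchall()

for row in running_counts:
    print(row)
```

Note that x is 0 for the first row, so any ratio built on it needs a guard against division by zero. Also, with only ORDER BY created_at and no explicit frame, the default frame is RANGE-based and treats rows with equal timestamps as peers, so ids 2 and 3 would both report the same y.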
No. You just need to separate the GROUP BY from the LAG OVER:
WITH cte AS (
SELECT
FLOOR(DATEDIFF(created_at, CURDATE()) / 7) AS weeks_ago,
COUNT(DISTINCT id) AS new_records
FROM answers
WHERE 1 = 1 -- todo: change this
GROUP BY weeks_ago
HAVING 1 = 1 -- todo: change this
)
SELECT
cte.*,
100 * (
new_records - LAG(new_records) OVER (ORDER BY weeks_ago)
) / LAG(new_records) OVER (ORDER BY weeks_ago) AS percent_increase
FROM cte
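The "aggregate first, then LAG" pattern can be checked end to end with a small sketch. Everything beyond the query shape is an assumption made for illustration: SQLite (via Python's sqlite3) stands in for MySQL, the sample dates and the fixed TODAY are invented so the result is reproducible, and week_bucket is a hypothetical helper that mimics FLOOR(DATEDIFF(created_at, CURDATE()) / 7), since SQLite has no DATEDIFF or CURDATE. A NULLIF guard is added so an empty previous cohort yields NULL rather than a division error.

```python
import sqlite3
from datetime import date, datetime

TODAY = date(2018, 6, 10)  # hypothetical fixed "today" for reproducible output


def week_bucket(created_at: str) -> int:
    """Mimic FLOOR(DATEDIFF(created_at, CURDATE()) / 7) on whole-day differences."""
    created = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S").date()
    return (created - TODAY).days // 7  # // floors, including for negative values


conn = sqlite3.connect(":memory:")
conn.create_function("week_bucket", 1, week_bucket)
conn.execute("CREATE TABLE answers (id INTEGER PRIMARY KEY, created_at TEXT)")

# Invented sample data: 2 rows three weeks back, 4 rows two weeks back, 3 rows last week.
sample_dates = [
    "2018-05-21", "2018-05-26",
    "2018-05-28", "2018-05-30", "2018-06-01", "2018-06-02",
    "2018-06-04", "2018-06-07", "2018-06-09",
]
conn.executemany(
    "INSERT INTO answers (created_at) VALUES (?)",
    [(d + " 18:05:56",) for d in sample_dates],
)

growth_rows = conn.execute(
    """
    WITH cte AS (
        -- step 1: one row per weekly cohort
        SELECT week_bucket(created_at) AS weeks_ago,
               COUNT(DISTINCT id)      AS new_records
        FROM answers
        GROUP BY week_bucket(created_at)
    )
    -- step 2: LAG runs over the already aggregated rows
    SELECT weeks_ago,
           new_records,
           100.0 * (new_records - LAG(new_records) OVER (ORDER BY weeks_ago))
               / NULLIF(LAG(new_records) OVER (ORDER BY weeks_ago), 0) AS percent_increase
    FROM cte
    ORDER BY weeks_ago
    """
).fetchall()

for row in growth_rows:
    print(row)
```

With these sample dates the cohorts -3, -2 and -1 contain 2, 4 and 3 rows, so percent_increase comes out as NULL (None in Python) for the oldest cohort, then 100.0 and -25.0. The oldest cohort has no predecessor, which is why its growth is NULL rather than 0.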