我一直在使用CTE(常用表格表达式)来编写将用于“更大图片”的小型查询。查询。
例如,我将创建一个CTE来计算一列计数。
然后基本上用另一个CTE创建该列的副本。
然后将它们连接在一起,这样我就可以为30天的窗口创建一个滚动总和。
一切似乎都在起作用,我最终得到的表格中有多行具有相同的值,因此总计数(总和)不正确。
这是使用CTE时常见的情况还是我做错了什么来创建重复的行?
我们说我有' table1':
row_number | session_date | user_id | item 1 2017-01-01 alex apple 2 2017-01-02 alex orange
目标是拥有用户购买的项目的滚动总和。
WITH TEMP
AS (
SELECT a.row_number
,a.session_date
,a.user_id
,a.item
,CASE
WHEN a.item = 'apple'
THEN 1
ELSE 0
END AS item_apple
,CASE
WHEN a.item = 'orange'
THEN 1
ELSE 0
END AS item_orange
FROM table1 AS a
)
,duplicate
AS (
SELECT *
FROM TEMP
)
,new_table
AS (
SELECT l.row_number
,r.session_date
,r.user_id
,r.item
,sum(r.item_apple) OVER (
PARTITION BY r.session_date
,r.user_id
) AS sum_apple_30day
,sum(r.item_orange) OVER (
PARTITION BY r.session_date
,r.user_id
) AS sum_orange_30day
FROM TEMP AS l
LEFT JOIN TEMP AS r ON (l.row_number = r.row_number)
LEFT JOIN duplicate dp ON l.user_id = dp.user_id
AND l.session_date <= dp.session_date
AND l.session_date + interval '30 days' > dp.session_date
)
SELECT *
FROM new_table
SQL小提琴:http://sqlfiddle.com/#!17/dade8/2/0
PostgreSQL 9.6
任何帮助将不胜感激!谢谢!
答案 0 :(得分:1)
您可以尝试以下内容。
WITH temp AS
(
SELECT
user_id, item, MIN(session_date) AS FirstDate
FROM table1
GROUP BY user_id, item
),
DateVals AS
(
SELECT
t.*,
(CAST (DATE_PART('day', t.session_date ::timestamp - md.FirstDate ::timestamp) AS INTEGER)) / 30 AS windowValue
FROM table1 AS t INNER JOIN temp AS md ON
t.user_id = md.user_id AND t.item = md.item
)
SELECT
dv.row_number, dv.user_id, dv.item,dv.session_date,
ac.AppleCount , oc.OrangeCount
FROM DateVals AS dv
LEFT OUTER JOIN
(
SELECT
row_number,
ROW_NUMBER() OVER(PARTITION BY user_id,item,windowValue ORDER BY session_date) AS AppleCount
FROM DateVals
WHERE item = 'apple'
)AS ac ON dv.row_number = ac.row_number
LEFT OUTER JOIN
(
SELECT
row_number,
ROW_NUMBER() OVER(PARTITION BY user_id,item,windowValue ORDER BY session_date) AS OrangeCount
FROM DateVals
WHERE item = 'orange'
)AS oc ON dv.row_number = oc.row_number
ORDER BY dv.user_id, dv.item,dv.session_date