I'm having trouble writing a query for Postgres (strictly speaking, it's Redshift). The table data is shown below. The table is PARTITION BY user_id ORDER BY created_at desc
 user_id | x | y | min | created_at
---------+---+---+-----+---------------------
       1 | 1 | 1 |   1 | 2015-01-15 17:26:53
       1 | 1 | 1 |   2 | 2015-01-15 17:26:54
       1 | 1 | 1 |   3 | 2015-01-15 17:26:55
       1 | 2 | 1 |  10 | 2015-01-16 02:46:21
       1 | 1 | 1 |  15 | 2015-01-16 02:46:22
       1 | 3 | 3 |  11 | 2015-01-16 03:01:44
       1 | 3 | 3 |   2 | 2015-01-16 03:02:06
       2 | 1 | 1 |   3 | 2015-01-16 03:02:12
       2 | 2 | 1 |   4 | 2015-01-16 03:02:15
       2 | 2 | 1 |   7 | 2015-01-16 03:02:18
What I want is
 user_id | x | y | sum_min
---------+---+---+---------
       1 | 1 | 1 |       6
       1 | 2 | 1 |      10
       1 | 1 | 1 |      15
       1 | 3 | 3 |      13
       2 | 1 | 1 |       3
       2 | 2 | 1 |      11
If I simply group by user_id, x, y, the result is
 user_id | x | y | sum_min
---------+---+---+---------
       1 | 1 | 1 |      21
       : | : | : |       :
That doesn't work for me :(
Answer 0: (score: 1)
Try this
with cte as (
    select user_id, x, y, created_at,
           -- sum min per user, (x, y) pair, and calendar day
           sum(min) over (partition by user_id, x, y, day_key) as sum_min
    from (
        select user_id, x, y, min, created_at,
               replace(created_at::date::text, '-', '') as day_key
        from usr
    ) t
)
select user_id, x, y, sum_min
from cte
group by sum_min, user_id, x, y
order by user_id
Answer 1: (score: 0)
Maybe try grouping it by the creation date:
select user_id, x, y, sum(min), created_at::date from test
group by user_id, x, y, created_at::date
order by user_id, x, y, created_at
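One way to check this suggestion against the sample data from the question is to load the rows into an in-memory SQLite database from Python. This is only a sketch under assumptions: SQLite's date() stands in for the created_at::date cast, the result is ordered by each group's earliest timestamp so it lines up with the desired output, the extra row inserted at the end is hypothetical, and nothing here is Redshift-specific.

```python
import sqlite3

# Sample rows from the question: (user_id, x, y, min, created_at).
ROWS = [
    (1, 1, 1, 1, "2015-01-15 17:26:53"),
    (1, 1, 1, 2, "2015-01-15 17:26:54"),
    (1, 1, 1, 3, "2015-01-15 17:26:55"),
    (1, 2, 1, 10, "2015-01-16 02:46:21"),
    (1, 1, 1, 15, "2015-01-16 02:46:22"),
    (1, 3, 3, 11, "2015-01-16 03:01:44"),
    (1, 3, 3, 2, "2015-01-16 03:02:06"),
    (2, 1, 1, 3, "2015-01-16 03:02:12"),
    (2, 2, 1, 4, "2015-01-16 03:02:15"),
    (2, 2, 1, 7, "2015-01-16 03:02:18"),
]

# SQLite's date() plays the role of created_at::date here (an assumption
# made so the query can run outside Postgres/Redshift).
QUERY = """
    SELECT user_id, x, y, SUM("min") AS sum_min
    FROM test
    GROUP BY user_id, x, y, date(created_at)
    ORDER BY user_id, MIN(created_at)
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE test (user_id INT, x INT, y INT, "min" INT, created_at TEXT)'
)
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?, ?)", ROWS)

# On the original sample, grouping by day matches the desired output.
by_day = conn.execute(QUERY).fetchall()
print(by_day)

# Hypothetical extra row: user 1 returns to (x, y) = (1, 1) later the same day.
conn.execute("INSERT INTO test VALUES (1, 1, 1, 5, '2015-01-16 03:05:00')")
by_day_extra = conn.execute(QUERY).fetchall()
print(by_day_extra)
```

On the original rows the per-day grouping reproduces the desired totals, but only because no (x, y) pair recurs within a single day. With the hypothetical extra row, the two separate visits to (1, 1) on 2015-01-16 are merged into one total of 20, even though the (3, 3) rows lie between them.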
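Answer 2: (score: 0)
It seems that what you want to do is calculate an aggregate function over clusters of records in a relation ordered on some column, where a cluster is a run of consecutive records with identical values in three columns, and clusters are separated only by a change in those three values. As far as I know, this is not possible in standard SQL, because the order of records is irrelevant to any SQL command. The fact that you order by date does not change that: SQL commands simply do not support this kind of stratification.
The only option that I am aware of is to create a plpgsql function with a cursor on your data relation (probably a view, but it works equally well on a table). You iterate over all the records in the relation, and for every cluster you encounter you sum up the min values and output a new record with the clustering columns and the summed value.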
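CREATE FUNCTION sum_clusters()
RETURNS TABLE (user_id int, x int, y int, sum_min int) AS $$
DECLARE
  data_row data%ROWTYPE;
  -- Clusters only make sense in a fixed order, so sort explicitly.
  cur CURSOR FOR SELECT * FROM data d ORDER BY d.user_id, d.created_at;
  cur_user integer;
  cur_x integer;
  cur_y integer;
  total integer;
BEGIN
  OPEN cur;
  FETCH cur INTO data_row;
  LOOP
    -- Stop once the cursor is exhausted.
    IF NOT FOUND THEN
      EXIT;
    END IF;
    -- Start a new cluster at the current row.
    cur_user := data_row.user_id;
    cur_x := data_row.x;
    cur_y := data_row.y;
    total := data_row.min;
    LOOP
      FETCH cur INTO data_row;
      IF NOT FOUND THEN
        EXIT;
      END IF;
      -- Extend the cluster while the three key columns stay the same.
      IF (data_row.user_id = cur_user) AND (data_row.x = cur_x) AND (data_row.y = cur_y) THEN
        total := total + data_row.min;
      ELSE
        EXIT;
      END IF;
    END LOOP;
    -- Emit one output row per cluster via the RETURNS TABLE columns.
    user_id := cur_user;
    x := cur_x;
    y := cur_y;
    sum_min := total;
    RETURN NEXT;
  END LOOP;
  CLOSE cur;
  RETURN;
END;
$$ LANGUAGE plpgsql;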
This is a lot of code and not particularly fast, but it should work.
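To make the iteration easier to follow outside the database, here is a minimal Python sketch of the same idea: walk the rows in order and sum min over each run of consecutive rows that share (user_id, x, y). It assumes the rows have already been fetched sorted by user_id and created_at; the names rows and sum_clusters are chosen for illustration only, and itertools.groupby stands in for the cursor loop.

```python
from itertools import groupby

# Sample rows from the question: (user_id, x, y, min, created_at),
# assumed to be sorted by user_id and then created_at.
rows = [
    (1, 1, 1, 1, "2015-01-15 17:26:53"),
    (1, 1, 1, 2, "2015-01-15 17:26:54"),
    (1, 1, 1, 3, "2015-01-15 17:26:55"),
    (1, 2, 1, 10, "2015-01-16 02:46:21"),
    (1, 1, 1, 15, "2015-01-16 02:46:22"),
    (1, 3, 3, 11, "2015-01-16 03:01:44"),
    (1, 3, 3, 2, "2015-01-16 03:02:06"),
    (2, 1, 1, 3, "2015-01-16 03:02:12"),
    (2, 2, 1, 4, "2015-01-16 03:02:15"),
    (2, 2, 1, 7, "2015-01-16 03:02:18"),
]


def sum_clusters(sorted_rows):
    """Sum min over each run of consecutive rows sharing (user_id, x, y)."""
    result = []
    # groupby only merges adjacent items with equal keys, so a key that
    # reappears later starts a new cluster instead of joining the old one.
    for key, run in groupby(sorted_rows, key=lambda r: r[:3]):
        result.append((*key, sum(r[3] for r in run)))
    return result


for cluster in sum_clusters(rows):
    print(cluster)
```

Because only adjacent rows are merged, the two separate runs of (1, 1, 1) for user 1 stay apart, which is the behaviour the question asks for.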