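I have a table containing our customers' orders. order_date: the order date (not unique, because each order may contain several products). customer_id: not unique either.
I want to count each customer's orders up to the current order_date, but because order_date contains duplicates, the results are not what I expect.
I am using window functions in Postgres 11.2.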
CREATE TABLE "public"."orders" (
"order_date" timestamp,
"customer_id" integer
);
Insert data:
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-12-13 20:45:24.571964', 402) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-12-13 20:45:24.571964', 402) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-10-12 20:08:39.635959', 466) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-11-04 22:15:14.905851', 483) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-11-04 22:15:14.905851', 483) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id")
I wrote this query to produce what I want, but it does not work:
select *,
       COALESCE(COUNT(*) OVER (partition by orders.customer_id
                               order by orders.order_date
                               range between interval '100 years' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW),
                0) AS customer_orders_count_up_to_now,
       COALESCE(COUNT(*) OVER (partition by orders.customer_id
                               order by orders.order_date asc
                               range BETWEEN interval '7 days' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW),
                0) AS customer_orders_last_seven_days
from orders
I expect the customer_orders_count_up_to_now and customer_orders_last_seven_days output columns to be 0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0. Because order_date contains duplicates, however, the actual output is different.
Answer 0 (score: 0)
If I understand correctly, you basically want count(distinct) as a window function. Postgres does not support that yet, but you can use select distinct in a subquery:
select o.*,
COALESCE(COUNT(*) OVER (partition by o.customer_id
order by o.order_date
range between interval '100 years' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW),
0) AS customer_orders_count_up_to_now,
COALESCE(COUNT(*) OVER (partition by o.customer_id
order by o.order_date asc
range BETWEEN interval '7 days' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW),
0) AS customer_orders_last_seven_days
from (SELECT DISTINCT o.customer_id, o.order_date from orders o) o
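The subquery above returns one row per distinct (customer_id, order_date) pair, so the duplicated product rows disappear from the output. If one output row per original row is still needed, the counts can be joined back to the table. The following is only a sketch under that assumption: the CTE names distinct_orders and counted are made up for illustration, and UNBOUNDED PRECEDING stands in for the 100-year interval.

```sql
-- Sketch: count distinct earlier orders, then attach the counts to every original row.
WITH distinct_orders AS (
    -- one row per order, regardless of how many products it contains
    SELECT DISTINCT customer_id, order_date
    FROM orders
),
counted AS (
    SELECT customer_id,
           order_date,
           -- all earlier orders of this customer (no peers exist after DISTINCT)
           COUNT(*) OVER (PARTITION BY customer_id
                          ORDER BY order_date
                          RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
                          EXCLUDE CURRENT ROW) AS customer_orders_count_up_to_now,
           -- earlier orders of this customer within the preceding 7 days
           COUNT(*) OVER (PARTITION BY customer_id
                          ORDER BY order_date
                          RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW
                          EXCLUDE CURRENT ROW) AS customer_orders_last_seven_days
    FROM distinct_orders
)
SELECT o.*,
       c.customer_orders_count_up_to_now,
       c.customer_orders_last_seven_days
FROM orders o
JOIN counted c USING (customer_id, order_date);
```

Note that the join uses equality, so rows whose customer_id or order_date is NULL would be dropped; a LEFT JOIN with IS NOT DISTINCT FROM conditions would be needed to keep them.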
Answer 1 (score: 0)
I found a solution and am sharing it here in case anyone else has the same problem:
select *,
       -- other rows of this customer up to and including the current order_date
       COALESCE(COUNT(*) OVER (partition by orders.customer_id
                               order by orders.order_date
                               range between interval '100 years' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW), 0)
       -- minus the other rows that share the current order_date
       - COALESCE(COUNT(*) OVER (partition by orders.customer_id, orders.order_date
                                 order by orders.order_date
                                 range between interval '100 years' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW), 0)
         AS customer_orders_count_up_to_now,
       COALESCE(COUNT(*) OVER (partition by orders.customer_id
                               order by orders.order_date asc
                               range BETWEEN interval '7 days' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW), 0)
       - COALESCE(COUNT(*) OVER (partition by orders.customer_id, orders.order_date
                                 order by orders.order_date asc
                                 range BETWEEN interval '7 days' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW), 0)
         AS customer_orders_last_seven_days
from orders
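The idea is that, to remove the duplicates from the rolling count, we subtract from it the number of other records that have the same order_date as the current row.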
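As far as I can tell, the same result can be written more directly with the EXCLUDE GROUP frame option, which Postgres 11 supports alongside EXCLUDE CURRENT ROW. In RANGE mode, a frame ending at CURRENT ROW also includes the current row's peers (rows with the same order_date), and EXCLUDE CURRENT ROW removes only the row itself; EXCLUDE GROUP removes the row and all its peers, which is what the subtraction achieves. A minimal sketch, with UNBOUNDED PRECEDING used in place of the 100-year interval:

```sql
-- Sketch: drop the current row and every row sharing its order_date from the frame.
SELECT o.*,
       COUNT(*) OVER (PARTITION BY o.customer_id
                      ORDER BY o.order_date
                      RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
                      EXCLUDE GROUP) AS customer_orders_count_up_to_now,
       COUNT(*) OVER (PARTITION BY o.customer_id
                      ORDER BY o.order_date
                      RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW
                      EXCLUDE GROUP) AS customer_orders_last_seven_days
FROM orders o;
```

Both this sketch and the subtraction count earlier rows, not earlier distinct orders: an earlier order with three product rows contributes 3. If each earlier order should count once, the table has to be deduplicated first, as in the select distinct approach.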