Select count of customer IDs with multiple orders within a rolling date range

Asked: 2015-07-21 08:51:36

Tags: oracle date subquery aggregate

My data looks like this:

order_date  phone_number  order_number
----------  ------------  ------------
18/03/2015  0912345678    123
27/03/2015  0912345678    176
18/03/2015  0973541893    453
20/03/2015  0565741534    678    
03/04/2015  0565741534    534

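I'd like to write a query that, based on 'order_date', treats today and the previous 9 days (or any other number of days) as a 10-day rolling window, and returns the count of 'phone_number' values with multiple orders in that window alongside the count of 'phone_number' values with a single order, e.g.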

date_from   date_to      count_multiple  count_single
----------  -----------  --------------  ------------
18/03/2015  27/03/2015   5               15
19/03/2015  28/03/2015   7               10
20/03/2015  29/03/2015   6               11
21/03/2015  30/03/2015   3               17

I can do the date-calculation part of the SELECT statement, e.g.:

SELECT DISTINCT order_date - 9 AS date_from, order_date AS date_to
FROM orders
WHERE order_date > ((SELECT MIN(order_date) FROM orders) + 9)
ORDER BY order_date;

...and I can get what I want if I specify exact parameters, e.g. multiple orders between 18 and 27 March 2015:

SELECT DISTINCT COUNT(*) FROM (
  SELECT phone_number, count(order_number) FROM orders
  WHERE order_date BETWEEN to_date('18/03/2015', 'dd/mm/yyyy') 
                       AND to_date('27/03/2015', 'dd/mm/yyyy')
  HAVING COUNT(order_number) > 1
  GROUP BY phone_number
) multiple_orders

...and the same for single orders...

SELECT DISTINCT COUNT(*) FROM (
  SELECT phone_number, count(order_number) FROM orders
  WHERE order_date BETWEEN to_date('18/03/2015', 'dd/mm/yyyy') 
                       AND to_date('27/03/2015', 'dd/mm/yyyy')
  HAVING COUNT(order_number) = 1
  GROUP BY phone_number
) single_orders

However, I can't figure out how to include these as subqueries in the main SELECT clause, driven by the first two date columns.

I'd like to write something like this:

SELECT 
  o.order_date - 9 AS date_from, 
  o.order_date AS date_to,
  (SELECT DISTINCT COUNT(*) FROM 
    (SELECT x.phone_number, COUNT(x.order_number) FROM orders x 
      WHERE x.order_date BETWEEN (o.order_date - 9)
                             AND  o.order_date
      HAVING COUNT(x.order_number) > 1
      GROUP BY x.phone_number
    )
  ) AS Has_Multiple, 
  (SELECT DISTINCT COUNT(*) FROM 
    (SELECT x.phone_number, COUNT(x.order_number) FROM orders x 
      WHERE x.order_date BETWEEN (o.order_date - 9)
                             AND  o.order_date
      HAVING COUNT(x.order_number) = 1
      GROUP BY x.phone_number
    )
  ) AS Has_Single 
FROM orders o
WHERE o.order_date > ((SELECT MIN(order_date) FROM orders) + 9)
ORDER BY o.order_date;

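Of course, the above doesn't work, but what I really want is for each count in columns 3 and 4 to be based on columns 1 and 2 (where column 1 is calculated from column 2).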

The current error is:

ORA-00904: "O"."ORDER_DATE": invalid identifier

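Note that I don't get any error if I leave the subqueries out of the SELECT statement. So it seems I'm not writing the subqueries correctly, because the main query can't be "seen" from inside the nested subquery :(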

I found all the individual pieces by searching here and on Google... but I can't seem to put them together... especially this "rolling date window" concept.

Any help is much appreciated!

1 answer:

Answer 0 (score: 1)

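I think you can achieve what you're after by using analytic functions with an appropriate windowing clause. Since you didn't provide sample input data that matches your expected output, I've had to come up with my own - I can only assume I've got the logic right; you'll have to double-check it. I've reduced the window from 9 days to 3 (ok, technically, I think it's from 10 days down to 4, but who's counting?! *{;-))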

with sample_data as (select 1 id, 1 num, trunc(sysdate, 'mm') + 1 dt from dual union all
                     select 2 id, 2 num, trunc(sysdate, 'mm') + 1 dt from dual union all
                     select 3 id, 3 num, trunc(sysdate, 'mm') + 1 dt from dual union all
                     select 4 id, 1 num, trunc(sysdate, 'mm') + 2 dt from dual union all
                     select 5 id, 2 num, trunc(sysdate, 'mm') + 2 dt from dual union all
                     select 6 id, 4 num, trunc(sysdate, 'mm') + 2 dt from dual union all
                     select 7 id, 4 num, trunc(sysdate, 'mm') + 3 dt from dual union all
                     select 8 id, 1 num, trunc(sysdate, 'mm') + 3 dt from dual union all
                     select 9 id, 7 num, trunc(sysdate, 'mm') + 3 dt from dual union all
                     select 10 id, 6 num, trunc(sysdate, 'mm') + 4 dt from dual union all
                     select 11 id, 6 num, trunc(sysdate, 'mm') + 4 dt from dual union all
                     select 12 id, 5 num, trunc(sysdate, 'mm') + 4 dt from dual union all
                     select 13 id, 6 num, trunc(sysdate, 'mm') + 5 dt from dual union all
                     select 14 id, 9 num, trunc(sysdate, 'mm') + 5 dt from dual union all
                     select 15 id, 3 num, trunc(sysdate, 'mm') + 5 dt from dual union all
                     select 16 id, 2 num, trunc(sysdate, 'mm') + 6 dt from dual),
             res as (select id,
                            num,
                            dt st_dt,
                            dt + 3 end_dt,
                            count(*) over (partition by num order by dt
                                           range between current row and 3 following) cnt_num_curr_and_next_3_days
                     from   sample_data)
select st_dt,
       end_dt,
       count(case when cnt_num_curr_and_next_3_days > 1 then 1 end) count_multiple,
       count(case when cnt_num_curr_and_next_3_days = 1 then 1 end) count_single
from   res
group by st_dt,
         end_dt
order by st_dt;

ST_DT      END_DT     COUNT_MULTIPLE COUNT_SINGLE
---------- ---------- -------------- ------------
02/07/2015 05/07/2015              2            1
03/07/2015 06/07/2015              2            1
04/07/2015 07/07/2015              0            3
05/07/2015 08/07/2015              2            1
06/07/2015 09/07/2015              0            3
07/07/2015 10/07/2015              0            1
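
To make the windowing clause used above more concrete, here is a minimal sketch with three made-up rows (the CTE name t and its values are illustrative assumptions, not data from the question). When the ORDER BY column is a DATE, a numeric RANGE offset is measured in days, so each row counts the rows for the same num dated between its own dt and dt + 3.

```sql
-- Illustrative data only: one num with orders on 2, 3 and 8 July 2015.
WITH t AS (
  SELECT 1 AS num, TO_DATE('02/07/2015', 'dd/mm/yyyy') AS dt FROM dual UNION ALL
  SELECT 1,        TO_DATE('03/07/2015', 'dd/mm/yyyy')       FROM dual UNION ALL
  SELECT 1,        TO_DATE('08/07/2015', 'dd/mm/yyyy')       FROM dual
)
SELECT num,
       dt,
       -- Rows for this num whose dt falls in [current dt, current dt + 3 days].
       COUNT(*) OVER (PARTITION BY num
                      ORDER BY dt
                      RANGE BETWEEN CURRENT ROW AND 3 FOLLOWING) AS cnt
FROM   t
ORDER  BY dt;
-- cnt is 2 for 02/07 (it sees 02/07 and 03/07), 1 for 03/07, 1 for 08/07.
```

Note that the count is attached to the row that starts the window. When the main query then groups by st_dt, a num only contributes to a window if it has an order on that window's start date; a num whose orders all fall later in the window is not counted for it.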

The more I look at this, the less I think analytic functions are up to the job, at least not on their own.

The MODEL clause might be the best solution, but unfortunately I'm no MODEL expert!
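
For comparison, here is a sketch of a plain join-based alternative that needs neither analytic functions nor the MODEL clause. It is not part of the original answer and has not been tuned for large tables. The orders CTE stands in for the real table and holds the five sample rows from the question; date_windows and per_phone are illustrative names. It assumes, as in the question's own date query, that a window is only wanted for dates that actually have an order, and it uses >= rather than > so that the first full window (18/03 to 27/03) is kept.

```sql
WITH orders AS (
  -- Stand-in for the real table: the question's five sample rows.
  SELECT TO_DATE('18/03/2015', 'dd/mm/yyyy') AS order_date,
         '0912345678' AS phone_number, 123 AS order_number FROM dual UNION ALL
  SELECT TO_DATE('27/03/2015', 'dd/mm/yyyy'), '0912345678', 176 FROM dual UNION ALL
  SELECT TO_DATE('18/03/2015', 'dd/mm/yyyy'), '0973541893', 453 FROM dual UNION ALL
  SELECT TO_DATE('20/03/2015', 'dd/mm/yyyy'), '0565741534', 678 FROM dual UNION ALL
  SELECT TO_DATE('03/04/2015', 'dd/mm/yyyy'), '0565741534', 534 FROM dual
),
date_windows AS (
  -- One 10-day window per distinct order date that has a full window behind it.
  SELECT DISTINCT order_date - 9 AS date_from, order_date AS date_to
  FROM   orders
  WHERE  order_date >= (SELECT MIN(order_date) FROM orders) + 9
),
per_phone AS (
  -- Number of orders per phone number inside each window.
  SELECT w.date_from, w.date_to, o.phone_number,
         COUNT(o.order_number) AS order_cnt
  FROM   date_windows w
  JOIN   orders o
    ON   o.order_date BETWEEN w.date_from AND w.date_to
  GROUP  BY w.date_from, w.date_to, o.phone_number
)
SELECT date_from,
       date_to,
       COUNT(CASE WHEN order_cnt > 1 THEN 1 END) AS count_multiple,
       COUNT(CASE WHEN order_cnt = 1 THEN 1 END) AS count_single
FROM   per_phone
GROUP  BY date_from, date_to
ORDER  BY date_to;
```

With those five rows, this should return two windows: 18/03/2015 to 27/03/2015 with count_multiple = 1 and count_single = 2, and 25/03/2015 to 03/04/2015 with count_multiple = 0 and count_single = 2. Because the date range is applied in a join condition rather than in a doubly nested correlated subquery, the reference to the outer dates never has to cross two levels of nesting, which appears to be what triggers ORA-00904 in the question's attempt.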