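Status start and end dates with multiple status changes for the same user/customer - SQL

Asked: 2019-05-16 21:37:21

Tags: sql

I have a customer whose status has changed several times, and I need to get the status lifecycle (the start date and end date of each period spent in a particular status). If a status comes back again later, it should appear as a separate row with the more recent dates (for example, active should appear in two rows below, one with the older dates and one with the most recent dates). Please help with this in HiveQL / SQL.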

Customer  Status           date
abc       active           5/1
abc       active           5/2
abc       active           5/3
abc       temp deactivate  5/4
abc       temp deactivate  5/5
abc       deactivate       5/6
abc       active           5/7
abc       active           5/8
abc       active           5/9
abc       active           5/10

Output:

customer  status           start date  end date
abc       active           5/1         5/3
abc       temp deactivate  5/4         5/5
abc       deactivate       5/6         5/6
abc       active           5/7         5/10

2 answers:

Answer 0: (score: 0)

This is not a complete answer, but it is close. Hopefully someone can build on it to finish the answer.

DECLARE @t TABLE (
    customer VARCHAR(3),
    status VARCHAR(15),
    date DATE
    );

INSERT 
    INTO @t (customer, [status], [date]) 
    VALUES
        -- ISO date literals avoid month/day ambiguity under different language settings
        ('abc','active','2019-05-01'),
        ('abc','active','2019-05-02'),
        ('abc','active','2019-05-03'),
        ('abc','temp deactivate','2019-05-04'),
        ('abc','temp deactivate','2019-05-05'),
        ('abc','deactivate','2019-05-06'),
        ('abc','active','2019-05-07'),
        ('abc','active','2019-05-08'),
        ('abc','active','2019-05-09'),
        ('abc','active','2019-05-10');


;WITH 
    cte1 AS (
        SELECT 
            t.[customer],
            t.[status],
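            -- Partition by customer so one customer's rows are never compared with another's
            LAG(t.[status], 1, NULL) OVER (PARTITION BY t.[customer] ORDER BY t.[date]) AS [prev_status],
            LEAD(t.[status], 1, NULL) OVER (PARTITION BY t.[customer] ORDER BY t.[date]) AS [next_status],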
            t.[date]
        FROM @t AS t)
    ,cte2 AS (
        SELECT 
            cte1.[customer],
            cte1.[status],
            CASE WHEN cte1.[prev_status] = cte1.[status] THEN NULL ELSE cte1.[date] END AS [min],
            CASE WHEN cte1.[next_status] = cte1.[status] THEN NULL ELSE cte1.[date] END AS [max],
            cte1.[date]
        FROM cte1)
SELECT
    cte2.[customer],
    cte2.[status], 
    cte2.[min] AS [start_date], 
    cte2.[max] AS [end_date]
FROM cte2 ;

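This returns an uncollapsed result: one row per input row, with start_date filled in only on the first row of each status run and end_date only on the last row (both are NULL on the rows in between).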
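
One possible way to collapse a result like this, offered as a sketch rather than as the answerer's intended completion: flag each row where the status differs from the previous row, take a running sum of those flags to number the runs, and then group by that number. The CTE names flagged and grouped and the columns is_new_run and run_id are hypothetical, the syntax is PostgreSQL-style rather than T-SQL or HiveQL, and the sample rows are copied from the question with the year 2019 assumed.

```sql
-- Sketch only: PostgreSQL-style syntax; flagged, grouped, is_new_run and run_id are made-up names.
WITH customer_table (customer, status, date) AS (
    VALUES
        ('abc', 'active',          DATE '2019-05-01'),
        ('abc', 'active',          DATE '2019-05-02'),
        ('abc', 'active',          DATE '2019-05-03'),
        ('abc', 'temp deactivate', DATE '2019-05-04'),
        ('abc', 'temp deactivate', DATE '2019-05-05'),
        ('abc', 'deactivate',      DATE '2019-05-06'),
        ('abc', 'active',          DATE '2019-05-07'),
        ('abc', 'active',          DATE '2019-05-08'),
        ('abc', 'active',          DATE '2019-05-09'),
        ('abc', 'active',          DATE '2019-05-10')
),
flagged AS (
    SELECT
        customer,
        status,
        date,
        -- 1 when the status differs from the previous row (or there is no previous row), else 0
        CASE
            WHEN LAG(status) OVER (PARTITION BY customer ORDER BY date) = status THEN 0
            ELSE 1
        END AS is_new_run
    FROM customer_table
),
grouped AS (
    SELECT
        customer,
        status,
        date,
        -- Running count of status changes: every row in the same run gets the same run_id
        SUM(is_new_run) OVER (PARTITION BY customer ORDER BY date) AS run_id
    FROM flagged
)
SELECT
    customer,
    status,
    MIN(date) AS start_date,
    MAX(date) AS end_date
FROM grouped
GROUP BY customer, run_id, status
ORDER BY customer, start_date;
```

On the sample data this should produce the four rows requested in the question. In Hive, date may need to be quoted with backticks, and the inline VALUES list would typically be replaced by the real table.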

Answer 1: (score: 0)

OK, I have now solved this problem by using the window function row_number twice. The SQL is as follows:

select
    customer,
    status,
    min(date) as start_date,
    max(date) as end_date
from
    (
    select
        date,
        customer,
        status,
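        -- number rows per customer; a global numbering breaks when several customers are interleaved by date
        row_number() over (partition by customer order by date) as seq_num,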
        row_number() over (partition by customer,status order by date) as seqnum_s
    from
        customer_table
    ) as tmp
group by
    customer,
    status,
    seq_num-seqnum_s
order by
    start_date;

Result:

 customer |     status      | start_date |  end_date  
----------+-----------------+------------+------------
 abc      | active          | 2019-05-01 | 2019-05-03
 abc      | temp deactivate | 2019-05-04 | 2019-05-05
 abc      | deactivate      | 2019-05-06 | 2019-05-06
 abc      | active          | 2019-05-07 | 2019-05-10
(4 rows)
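
To see why grouping by seq_num-seqnum_s works, it can help to look at the two row numbers side by side before aggregating. The query below is only an illustrative sketch: it uses PostgreSQL-style syntax, inlines the question's sample rows (year 2019 assumed) under the name customer_table, numbers rows per customer, and adds a hypothetical column grp for the difference.

```sql
-- Sketch only: shows the intermediate row numbers; grp is a made-up column name.
WITH customer_table (customer, status, date) AS (
    VALUES
        ('abc', 'active',          DATE '2019-05-01'),
        ('abc', 'active',          DATE '2019-05-02'),
        ('abc', 'active',          DATE '2019-05-03'),
        ('abc', 'temp deactivate', DATE '2019-05-04'),
        ('abc', 'temp deactivate', DATE '2019-05-05'),
        ('abc', 'deactivate',      DATE '2019-05-06'),
        ('abc', 'active',          DATE '2019-05-07'),
        ('abc', 'active',          DATE '2019-05-08'),
        ('abc', 'active',          DATE '2019-05-09'),
        ('abc', 'active',          DATE '2019-05-10')
)
SELECT
    date,
    customer,
    status,
    -- position of the row among all of the customer's rows
    ROW_NUMBER() OVER (PARTITION BY customer ORDER BY date) AS seq_num,
    -- position of the row among the customer's rows with the same status
    ROW_NUMBER() OVER (PARTITION BY customer, status ORDER BY date) AS seqnum_s,
    -- both counters advance together inside a run, so their difference stays constant
    ROW_NUMBER() OVER (PARTITION BY customer ORDER BY date)
      - ROW_NUMBER() OVER (PARTITION BY customer, status ORDER BY date) AS grp
FROM customer_table
ORDER BY customer, date;
```

On the sample data, grp is 0 for the first active run, 3 for temp deactivate, 5 for deactivate, and 3 for the second active run. The two active runs get different values, which is what separates them; temp deactivate and the second active run share the value 3, which is why status must also be part of the GROUP BY.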