Get the latest record for each customer per week

Time: 2017-11-29 20:37:12

Tags: sql impala

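I'm trying to query a table and return, for each week, the most recent combination of choices each customer made. For example, here is the table: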

(int)        (string)    (int)      (bigint)
Customer ID  Choice      Week       Inserted at
100          a, b, c     2          20171002
100          a, b        2          20171004
101          b, c, d     2          20171002
102          a, c, d     2          20171002
103          a, b, c     2          20171002
103          a, b, d     2          20171003
100          a, b, c, d  3          20171010
101          a, c, d     3          20171010
101          b, c, d     3          20171011
102          a, b        3          20171010
103          a, b, c     3          20171010
103          b, c, d     3          20171012
103          a, b, d     3          20171014

This is the result I want the query to produce:

Customer ID  Choice         Week    Inserted at
100          a, b           2       20171004
101          b, c, d        2       20171002
102          a, c, d        2       20171002
103          a, b, d        2       20171003
100          a, b, c, d     3       20171010
101          b, c, d        3       20171011
102          a, b           3       20171010
103          a, b, d        3       20171014

A customer can change their choice only once per day, so I don't need to worry about several changes within a single day.

This is what I started with, but it is missing a lot of rows. Any feedback?

SELECT c.customer, c.combo, c.week, c.date
FROM tableCombos AS c
WHERE not exists (SELECT *
                  FROM tableCombos AS recent
                  WHERE recent.customer = c.customer
                  AND recent.date > c.date)
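
A likely cause of the missing rows: the subquery is correlated on customer only, so a row survives only when it is that customer's newest record across all weeks. The sketch below is an illustration, not part of the original post. It assumes Python's built-in sqlite3 module as a stand-in for Impala and loads the sample table from the question; the helper `run` is hypothetical. It runs the query as posted, then again with `recent.week = c.week` added to the correlation.

```python
import sqlite3

# Sample rows from the question: (customer, combo, week, date).
ROWS = [
    (100, "a, b, c", 2, 20171002), (100, "a, b", 2, 20171004),
    (101, "b, c, d", 2, 20171002), (102, "a, c, d", 2, 20171002),
    (103, "a, b, c", 2, 20171002), (103, "a, b, d", 2, 20171003),
    (100, "a, b, c, d", 3, 20171010), (101, "a, c, d", 3, 20171010),
    (101, "b, c, d", 3, 20171011), (102, "a, b", 3, 20171010),
    (103, "a, b, c", 3, 20171010), (103, "b, c, d", 3, 20171012),
    (103, "a, b, d", 3, 20171014),
]


def run(extra_condition):
    """Hypothetical helper: run the NOT EXISTS query with an optional extra predicate."""
    conn = sqlite3.connect(":memory:")  # SQLite is an assumption, standing in for Impala
    conn.execute(
        "CREATE TABLE tableCombos (customer INT, combo TEXT, week INT, date INT)"
    )
    conn.executemany("INSERT INTO tableCombos VALUES (?, ?, ?, ?)", ROWS)
    sql = f"""
        SELECT c.customer, c.combo, c.week, c.date
        FROM tableCombos AS c
        WHERE NOT EXISTS (SELECT *
                          FROM tableCombos AS recent
                          WHERE recent.customer = c.customer
                            {extra_condition}
                            AND recent.date > c.date)
        ORDER BY c.week, c.customer
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


original = run("")                       # the query as posted
fixed = run("AND recent.week = c.week")  # also correlate on week

print(len(original), len(fixed))
```

On the sample data the posted query keeps 4 rows, all from week 3, while the week-correlated version keeps the 8 rows listed in the desired result.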

2 Answers:

Answer 0: (score: 1)

Use a window function:

select tc.*
from (select tc.*,
             row_number() over (partition by customer, week order by date desc) as seqnum
      from tableCombos tc
     ) tc
where seqnum = 1;
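
To see the window-function approach on a slice of the question's data, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for Impala (window functions need SQLite 3.25 or newer) and uses only customers 100 and 103 from the sample table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is an assumption, standing in for Impala
conn.execute(
    "CREATE TABLE tableCombos (customer INT, combo TEXT, week INT, date INT)"
)
conn.executemany(
    "INSERT INTO tableCombos VALUES (?, ?, ?, ?)",
    [
        (100, "a, b, c", 2, 20171002),
        (100, "a, b", 2, 20171004),
        (103, "a, b, c", 2, 20171002),
        (103, "a, b, d", 2, 20171003),
        (100, "a, b, c, d", 3, 20171010),
        (103, "a, b, c", 3, 20171010),
        (103, "b, c, d", 3, 20171012),
        (103, "a, b, d", 3, 20171014),
    ],
)

# Number the rows inside each (customer, week) group, newest first,
# then keep only the row numbered 1.
latest = conn.execute(
    """
    SELECT customer, combo, week, date
    FROM (SELECT tc.*,
                 ROW_NUMBER() OVER (PARTITION BY customer, week
                                    ORDER BY date DESC) AS seqnum
          FROM tableCombos AS tc) AS t
    WHERE seqnum = 1
    ORDER BY week, customer
    """
).fetchall()
conn.close()

for row in latest:
    print(row)
```

Because `ROW_NUMBER()` assigns exactly one row the number 1 in each partition, this returns a single row per customer and week even if two rows were to share the same date.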

Answer 1: (score: 0)

First, I would get the latest date for each customer and week:

select customer, week, max(date) as max_date
from tableCombos
group by customer, week

Then I can join this result back to the main table:

SELECT c.customer, c.combo, c.week, c.date
FROM tableCombos AS c
JOIN (SELECT customer, week, max(date) AS max_date
      FROM tableCombos
      GROUP BY customer, week) AS x
  ON c.customer = x.customer
 AND c.week = x.week
 AND c.date = x.max_date
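
The aggregate-then-join idea can be tried out with the sketch below. It assumes Python's built-in sqlite3 module as a stand-in for Impala, uses the column names from the question's query (`customer`, `combo`, `week`, `date`), and loads only customers 101 and 102 from the sample table. The alias `max_date` is an illustrative name, not from the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is an assumption, standing in for Impala
conn.execute(
    "CREATE TABLE tableCombos (customer INT, combo TEXT, week INT, date INT)"
)
conn.executemany(
    "INSERT INTO tableCombos VALUES (?, ?, ?, ?)",
    [
        (101, "b, c, d", 2, 20171002),
        (102, "a, c, d", 2, 20171002),
        (101, "a, c, d", 3, 20171010),
        (101, "b, c, d", 3, 20171011),
        (102, "a, b", 3, 20171010),
    ],
)

# Step 1 (subquery): latest date per (customer, week).
# Step 2 (join): pull the full row that carries that date.
joined = conn.execute(
    """
    SELECT c.customer, c.combo, c.week, c.date
    FROM tableCombos AS c
    JOIN (SELECT customer, week, MAX(date) AS max_date
          FROM tableCombos
          GROUP BY customer, week) AS x
      ON c.customer = x.customer
     AND c.week = x.week
     AND c.date = x.max_date
    ORDER BY c.week, c.customer
    """
).fetchall()
conn.close()

for row in joined:
    print(row)
```

Unlike the `row_number()` version, a join on the maximum date would return every row that ties for the latest date. The question states that a customer changes their choice at most once per day, so ties should not occur here.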