SQL query to find the date of a buyer's first order from their second seller

Asked: 2019-10-14 15:53:39

Tags: sql amazon-redshift

I have a set of purchase data, made by different buyers from different sellers, that looks like this:

buyerid || sellerid || orderid || timestamp 

John123 || SellerABC || 123-abc-x1z || 26/07/2019
John123 || SellerABC || 123-abc-i9h || 28/07/2019
John123 || SellerABC || 123-abc-y16 || 28/07/2019
John123 || SellerDEF || 123-def-u13 || 30/07/2019
Bill456 || SellerABC || 456-abc-o34 || 02/08/2019
Bill456 || SellerABC || 456-abc-l3q || 09/08/2019
Bill456 || SellerABC || 456-abc-j5d || 10/08/2019
Bill456 || SellerDEF || 456-def-i61 || 11/08/2019

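I would like to create a view in SQL that retrieves the timestamp of each buyer's first order from their SECOND seller. If a buyer has never ordered from a second seller, the entry should be NULL. The resulting view should look like this: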

buyerid || first_order_second_seller_timestamp 

John123 || 30/07/2019
Bill456 || 11/08/2019

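I imagine this will take some crazy partitioning and subqueries, but any help would be much appreciated! So far, I can only retrieve the first and last order using standard SQL functions: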

SELECT
  "buyerid"
, "min"("timestamp") "first_order_timestamp"
, "max"("timestamp") "last_order_timestamp"
FROM
  default.order_table
GROUP BY "buyerid"

2 answers:

Answer 0 (score: 0)

Hmmm . . . This is a bit tricky. Here is one approach using lag():

select buyerid, min(timestamp)
from (select t.*,
             lag(sellerid) over (partition by buyerid order by timestamp) as prev_sellerid
      from order_table t
     ) t
where prev_sellerid <> sellerid   -- also filters out `NULL` values
group by buyerid;

To keep buyers who have no second seller (with a NULL result), move the filter condition into a conditional aggregation:

select buyerid, min(case when prev_sellerid <> sellerid then timestamp end)
from (select t.*,
             lag(sellerid) over (partition by buyerid order by timestamp) as prev_sellerid
      from order_table t
     ) t
group by buyerid;

EDIT:

You can also use two levels of aggregation:

select buyerid, min(case when seqnum = 2 then min_timestamp end)
from (select buyerid, sellerid, min(timestamp) as min_timestamp,
             row_number() over (partition by buyerid order by min(timestamp)) as seqnum
      from order_table t
      group by buyerid, sellerid
     ) bs
group by buyerid;

This also generalizes to the nth seller: change the 2 in the seqnum comparison to n.
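As a rough way to check the two-level aggregation, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. Several things here are assumptions and not part of the original question: SQLite stands in for Redshift (window functions need SQLite 3.25 or later), the timestamp column is renamed `ts` and stored as ISO date strings so that text ordering matches date ordering, the buyer `Ann789` is a made-up row added to show the NULL case, and the helper name `nth_seller_first_order` is hypothetical. The grouping and the ranking are split into separate subqueries here purely for readability.

```python
import sqlite3

# In-memory SQLite database standing in for the real table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_table (buyerid TEXT, sellerid TEXT, orderid TEXT, ts TEXT)"
)

# Sample rows from the question, with dates rewritten as ISO strings.
# Ann789 is a hypothetical buyer with a single seller, to show the NULL case.
rows = [
    ("John123", "SellerABC", "123-abc-x1z", "2019-07-26"),
    ("John123", "SellerABC", "123-abc-i9h", "2019-07-28"),
    ("John123", "SellerABC", "123-abc-y16", "2019-07-28"),
    ("John123", "SellerDEF", "123-def-u13", "2019-07-30"),
    ("Bill456", "SellerABC", "456-abc-o34", "2019-08-02"),
    ("Bill456", "SellerABC", "456-abc-l3q", "2019-08-09"),
    ("Bill456", "SellerABC", "456-abc-j5d", "2019-08-10"),
    ("Bill456", "SellerDEF", "456-def-i61", "2019-08-11"),
    ("Ann789", "SellerABC", "789-abc-a00", "2019-08-01"),
]
conn.executemany("INSERT INTO order_table VALUES (?, ?, ?, ?)", rows)

# Inner query: first order per (buyer, seller).
# Middle query: number each buyer's sellers by that first-order date.
# Outer query: pick the n-th seller's first order, or NULL if there is none.
QUERY = """
SELECT buyerid,
       MIN(CASE WHEN seqnum = :n THEN min_ts END) AS nth_seller_first_order
FROM (SELECT buyerid, sellerid, min_ts,
             ROW_NUMBER() OVER (PARTITION BY buyerid ORDER BY min_ts) AS seqnum
      FROM (SELECT buyerid, sellerid, MIN(ts) AS min_ts
            FROM order_table
            GROUP BY buyerid, sellerid) AS per_seller) AS ranked
GROUP BY buyerid
ORDER BY buyerid
"""


def nth_seller_first_order(connection, n):
    """Return (buyerid, timestamp-or-None) rows for each buyer's n-th seller."""
    return connection.execute(QUERY, {"n": n}).fetchall()


for buyerid, first_ts in nth_seller_first_order(conn, 2):
    print(buyerid, first_ts)
```

Calling the helper with `n=1` instead would return each buyer's very first order, which is one way to sanity-check the numbering.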

Answer 1 (score: 0)

You need to rank the seller IDs for each buyer and take the second-ranked one.

Returns