Counting mismatches based on date differences

Time: 2012-07-23 22:12:10

Tags: sql join hive

Below is the data in Table1

BUYER_ID   |   ITEM_ID         |    CREATED_TIME
-----------+-------------------+------------------------
1345653        110909316904         2012-07-09 21:29:06
1345653        151851771618         2012-07-09 19:57:33
1345653        221065796761         2012-07-09 19:31:48
1345653        400307563710         2012-07-09 18:57:33
1345653        310411560125         2012-07-09 16:09:49
1345653        120945302103         2012-07-09 13:40:23
1345653        261060982989         2012-07-09 09:02:21

Below is the data in Table2

USER_ID    |   PRODUCT_ID      |    LAST_TIME
-----------+-------------------+----------------------
1345653       110909316904         2012-07-09 21:30:06
1345653       151851771618         2012-07-09 19:57:33
1345653       221065796761         2012-07-09 19:31:48
1345653       400307563710         2012-07-09 18:57:33

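Problem statement:

I need to compare Table2 with Table1 on BUYER_ID and USER_ID, and I need to find the count of mismatches where the difference between CREATED_TIME and LAST_TIME is greater than 15 minutes.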

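So, looking at the example above, the ITEM_ID and PRODUCT_ID in the first row of each table are the same, but CREATED_TIME and LAST_TIME are not, and the difference between the two times is only 1 minute. If the difference is greater than 15 minutes, I want to report the row as an error. So the expected output for the case above would be: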

BUYER_ID    ERROR
1345653       1

1 Answer:

Answer 0: (score: 1)

First, find all the buyers that match:

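-- Pair each table1 row with the table2 rows of the same buyer
-- whose times are within 15 minutes of each other.
select *
from table1 t1 join
     table2 t2
     on t1.buyer_id = t2.user_id and
        datediff(minute, t1.created_time, t2.last_time) between -15 and 15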
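The `datediff(datepart, start, end)` form used above is SQL Server syntax. The question is tagged hive, where `datediff` takes two dates and returns whole days, so the same 15-minute window might instead be expressed in seconds with `unix_timestamp`. A minimal sketch, assuming both time columns are stored as strings in `yyyy-MM-dd HH:mm:ss` format; the time filter sits in the `where` clause because older Hive releases accept only equality conditions in `on`:

```sql
-- Sketch for Hive: the 15-minute window expressed in seconds.
-- Assumes created_time and last_time are 'yyyy-MM-dd HH:mm:ss' strings.
select *
from table1 t1 join
     table2 t2
     on t1.buyer_id = t2.user_id
where abs(unix_timestamp(t1.created_time) - unix_timestamp(t2.last_time)) <= 15 * 60
```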

Using that, now find the cases where there is no match:

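-- matches: table1/table2 pairs for the same buyer within 15 minutes
with matches as (
     select *
     from table1 t1 join
          table2 t2
          on t1.buyer_id = t2.user_id and
             datediff(minute, t1.created_time, t2.last_time) between -15 and 15
    )
-- keep the table1 rows that have no corresponding row in matches
select *
from table1 t1 left outer join
     matches m
     on t1.buyer_id = m.user_id and
        t1.item_id = m.product_id and   -- table1 has item_id, not product_id
        t1.created_time = m.created_time
where m.buyer_id is null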
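
The question asks for a count per buyer rather than the unmatched rows themselves. A minimal sketch of that aggregation, assuming the same SQL Server-style `datediff(minute, ...)` as the answer and using `not exists` in place of the left outer join. The `error` alias simply mirrors the column name in the question's expected output:

```sql
-- Sketch: per-buyer count of table1 rows with no table2 row for the
-- same buyer and item within 15 minutes.
select t1.buyer_id, count(*) as error
from table1 t1
where not exists (
      select 1
      from table2 t2
      where t2.user_id = t1.buyer_id and
            t2.product_id = t1.item_id and
            datediff(minute, t1.created_time, t2.last_time) between -15 and 15
     )
group by t1.buyer_id
```

Note that this treats a table1 row with no counterpart in table2 at all as unmatched, just as the left outer join does.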