Here is my sample data table:
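+------+--------+-----------------------------+-----------+----------+----------+
| row# | date   | customerid                  | event     | itemid-A | Itemid-B |
+------+--------+-----------------------------+-----------+----------+----------+
| 1    | 5/1/17 | 4c9b3705121ac1493640912601  | page load | 473685   |          |
| 2    | 5/1/17 | 11dacfc4251da01493672636536 | page load | 863438   |          |
| 3    | 5/1/17 | 11dacfc4251da01493672636536 | click     | 863438   | 45485    |
+------+--------+-----------------------------+-----------+----------+----------+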
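Condition #1: I need to remove row 2 from the data because it has the same customerid as row 3. In other words, when a customerid appears more than once, drop the page load event and keep the click event. A click event will have a unique Itemid-B.
Condition #2: When a customerid is not duplicated, I need to keep the page load event, as in row 1.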
Answer 0 (score: 1)
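-- Rank each customer's events so that 'click' sorts ahead of 'page load',
-- then keep only the top-ranked row per customerid.
select dt,customerid,event,itemid_A,Itemid_B
from (select *
            ,row_number() over
             (
                 partition by customerid
                 order by field(event,'click','page load')
             ) as rn
      from mytable
     ) t
where rn = 1
;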
+------------+-----------------------------+-----------+----------+----------+
| dt | customerid | event | itemid_a | itemid_b |
+------------+-----------------------------+-----------+----------+----------+
| 2017-05-01 | 11dacfc4251da01493672636536 | click | 863,438 | 45,485 |
| 2017-05-01 | 4c9b3705121ac1493640912601 | page load | 473,685 | (null) |
+------------+-----------------------------+-----------+----------+----------+
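If your engine has no field() function, the same ranking can probably be written with a CASE expression. Below is a minimal sketch that loads the three sample rows into an in-memory SQLite database and runs that variant. It assumes an SQLite build with window-function support (3.25 or later); the table layout, the ISO date strings, and the alias `ranked` are my own choices for illustration, not something from the question. One behavioural difference to be aware of: here any event other than 'click' or 'page load' sorts last, whereas field() returns 0 for unlisted values, so they would sort first.

```python
import sqlite3

# Sample rows copied from the question; itemid_b is NULL for page loads.
rows = [
    ("2017-05-01", "4c9b3705121ac1493640912601", "page load", 473685, None),
    ("2017-05-01", "11dacfc4251da01493672636536", "page load", 863438, None),
    ("2017-05-01", "11dacfc4251da01493672636536", "click", 863438, 45485),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable ("
    "dt text, customerid text, event text, itemid_a integer, itemid_b integer)"
)
conn.executemany("insert into mytable values (?, ?, ?, ?, ?)", rows)

# CASE replaces field(): 'click' ranks first, 'page load' second,
# anything else last. row_number() then picks one row per customerid.
query = """
select dt, customerid, event, itemid_a, itemid_b
from (select t.*,
             row_number() over (
                 partition by customerid
                 order by case event
                              when 'click' then 1
                              when 'page load' then 2
                              else 3
                          end
             ) as rn
      from mytable t
     ) ranked
where rn = 1
order by customerid
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the sample data this should leave one row per customerid: the click row for the duplicated customer and the page load row for the other one, matching the two conditions in the question.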