Row EventType CloudId ts
1 stop 5201156607311872 2018-07-07 12:25:21 UTC
2 start 5201156607311872 2018-07-07 12:27:39 UTC
3 start 5201156607311872 2018-07-07 12:28:15 UTC
4 stop 5738776789778432 2018-07-07 12:28:54 UTC
5 stop 5201156607311872 2018-07-07 12:30:30 UTC
6 stop 5738776789778432 2018-07-07 12:37:45 UTC
7 stop 5738776789778432 2018-07-07 12:40:52 UTC
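I have a table structured as shown above. I want to keep only the first event before the EventType changes, i.e. the first row of each run of consecutive rows that share the same EventType. For example, rows 2 and 3 have the same EventType, so I need to remove row 3 from the table. Rows 4, 5, 6 and 7 have the same EventType, so I want to keep row 4 and remove rows 5, 6 and 7.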
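The rule described here (keep a row only when its EventType differs from that of the row just before it) can be sketched in a few lines of Python. This is only an illustration on the sample rows from the table above; the function name `first_of_each_run` and the tuple layout are assumptions made for the example, not part of the question.

```python
# Sample rows from the question: (row, event_type, cloud_id, ts).
ROWS = [
    (1, "stop", 5201156607311872, "2018-07-07 12:25:21 UTC"),
    (2, "start", 5201156607311872, "2018-07-07 12:27:39 UTC"),
    (3, "start", 5201156607311872, "2018-07-07 12:28:15 UTC"),
    (4, "stop", 5738776789778432, "2018-07-07 12:28:54 UTC"),
    (5, "stop", 5201156607311872, "2018-07-07 12:30:30 UTC"),
    (6, "stop", 5738776789778432, "2018-07-07 12:37:45 UTC"),
    (7, "stop", 5738776789778432, "2018-07-07 12:40:52 UTC"),
]


def first_of_each_run(rows):
    """Keep a row only if its event type differs from the previous row's."""
    kept = []
    prev_type = None  # there is no previous row before the first one
    for row in sorted(rows, key=lambda r: r[0]):  # process in Row order
        if row[1] != prev_type:
            kept.append(row)
        prev_type = row[1]
    return kept


if __name__ == "__main__":
    for row in first_of_each_run(ROWS):
        print(row)
```

On the sample data this keeps rows 1, 2 and 4, which matches the outcome asked for above.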
Answer 0 (score: 3)
Use lag():
select t.*
from (select t.*,
lag(eventtype) over (order by row) as prev_eventtype
from t
) t
where prev_eventtype is null or prev_eventtype <> eventtype;
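To try the lag() approach without a database server, the same query can be run against an in-memory SQLite database from Python. This is a hedged sketch, not part of the original answer: it assumes a SQLite build with window-function support (3.25 or later), and it renames the `Row` column to `row_no` (a name chosen only for this demo) to avoid depending on how a given engine treats the word ROW.

```python
import sqlite3

# Sample rows from the question: (row_no, eventtype, cloudid, ts).
SAMPLE = [
    (1, "stop", 5201156607311872, "2018-07-07 12:25:21 UTC"),
    (2, "start", 5201156607311872, "2018-07-07 12:27:39 UTC"),
    (3, "start", 5201156607311872, "2018-07-07 12:28:15 UTC"),
    (4, "stop", 5738776789778432, "2018-07-07 12:28:54 UTC"),
    (5, "stop", 5201156607311872, "2018-07-07 12:30:30 UTC"),
    (6, "stop", 5738776789778432, "2018-07-07 12:37:45 UTC"),
    (7, "stop", 5738776789778432, "2018-07-07 12:40:52 UTC"),
]

# Same shape as the query above, with the ordering column renamed to row_no.
LAG_QUERY = """
select row_no, eventtype, cloudid, ts
from (select t.*,
             lag(eventtype) over (order by row_no) as prev_eventtype
      from t
     ) t
where prev_eventtype is null or prev_eventtype <> eventtype
order by row_no
"""


def run_lag_demo():
    """Load the sample rows into SQLite and return the rows the query keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (row_no integer, eventtype text, cloudid integer, ts text)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?)", SAMPLE)
    result = conn.execute(LAG_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_lag_demo():
        print(row)
```

The `prev_eventtype is null` branch is what keeps the very first row, since lag() has nothing to look back at there.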
Answer 1 (score: 3)
The following is for BigQuery Standard SQL:
#standardSQL
SELECT * EXCEPT(prev_eventtype) FROM (
SELECT *, LAG(eventtype) OVER (ORDER BY ts) AS prev_eventtype
FROM `project.dataset.table`
)
WHERE prev_eventtype IS NULL OR prev_eventtype <> eventtype
You can test and play with the above using the dummy data from the question:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'stop' EventType, 5201156607311872 CloudId, TIMESTAMP '2018-07-07 12:25:21 UTC' ts UNION ALL
SELECT 'start', 5201156607311872, '2018-07-07 12:27:39 UTC' UNION ALL
SELECT 'start', 5201156607311872, '2018-07-07 12:28:15 UTC' UNION ALL
SELECT 'stop', 5738776789778432, '2018-07-07 12:28:54 UTC' UNION ALL
SELECT 'stop', 5201156607311872, '2018-07-07 12:30:30 UTC' UNION ALL
SELECT 'stop', 5738776789778432, '2018-07-07 12:37:45 UTC' UNION ALL
SELECT 'stop', 5738776789778432, '2018-07-07 12:40:52 UTC'
)
SELECT * EXCEPT(prev_eventtype) FROM (
SELECT *, LAG(eventtype) OVER (ORDER BY ts) AS prev_eventtype
FROM `project.dataset.table`
)
WHERE prev_eventtype IS NULL OR prev_eventtype <> eventtype
Result:
EventType CloudId ts
stop 5201156607311872 2018-07-07 12:25:21 UTC
start 5201156607311872 2018-07-07 12:27:39 UTC
stop 5738776789778432 2018-07-07 12:28:54 UTC
Answer 2 (score: 1)
select
Row,
EventType,
CloudId,
ts
from
(
select
Row,
EventType,
CloudId,
ts,
row_number() over (partition by EventType order by CloudId,Row) as rnk
from table_name
)evnt where rnk=1
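To see which rows this query returns on the question's sample data, the `row_number() over (partition by EventType order by CloudId, Row)` step can be mimicked in plain Python. This is a minimal sketch, and the helper name `first_per_event_type` is made up for the illustration. On this data it yields one row per EventType (rows 1 and 2), which is worth comparing against the rows 1, 2 and 4 that the question asks for.

```python
from itertools import groupby

# Sample rows from the question: (row, event_type, cloud_id, ts).
ROWS = [
    (1, "stop", 5201156607311872, "2018-07-07 12:25:21 UTC"),
    (2, "start", 5201156607311872, "2018-07-07 12:27:39 UTC"),
    (3, "start", 5201156607311872, "2018-07-07 12:28:15 UTC"),
    (4, "stop", 5738776789778432, "2018-07-07 12:28:54 UTC"),
    (5, "stop", 5201156607311872, "2018-07-07 12:30:30 UTC"),
    (6, "stop", 5738776789778432, "2018-07-07 12:37:45 UTC"),
    (7, "stop", 5738776789778432, "2018-07-07 12:40:52 UTC"),
]


def first_per_event_type(rows):
    """Mimic: row_number() over (partition by EventType order by CloudId, Row) = 1."""
    # Sort by partition key first, then by the window's ordering columns.
    by_type = sorted(rows, key=lambda r: (r[1], r[2], r[0]))
    # The first row of each EventType group is the one with row_number 1.
    firsts = [next(group) for _, group in groupby(by_type, key=lambda r: r[1])]
    return sorted(firsts, key=lambda r: r[0])


if __name__ == "__main__":
    for row in first_per_event_type(ROWS):
        print(row)
```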
Answer 3 (score: 1)
You can use a SELECT statement to simply hide the unwanted rows:
select t.*
from table_name t
where t.row = (select min(t1.row) from table_name t1 where t1.EventType = t.EventType);