Query to capture rows when the value in a column changes

Time: 2018-04-30 14:24:31

Tags: hadoop hive

I need to capture the rows where the value of a particular column, for example "Toggle", changes.

I have the following data:

ID  ROW Toggle  Date
661 1   1   2017-03-01
661 2   1   2017-03-02
661 3   1   2017-03-03
661 4   1   2017-03-04
661 5   1   2017-03-05
661 6   1   2017-03-06
661 7   1   2017-03-07
661 8   1   2017-03-08
661 9   1   2017-03-09
661 10  1   2017-03-10
661 11  1   2017-03-11
661 12  1   2017-03-12
661 13  1   2017-03-13
661 14  1   2017-03-14
661 15  1   2017-03-15
661 16  1   2017-03-16
661 17  1   2017-03-17
661 18  1   2017-03-18
661 19  1   2017-03-19
661 20  1   2017-03-20
661 21  1   2017-03-21
661 22  1   2017-03-22
661 23  1   2017-03-23
661 24  1   2017-03-24
661 25  1   2017-03-25
661 26  1   2017-03-26
661 27  1   2017-03-27
661 28  1   2017-03-28
661 29  1   2017-03-29
661 30  1   2017-03-30
661 31  1   2017-03-31
661 32  1   2017-04-01
661 33  1   2017-04-02
661 34  1   2017-04-03
661 35  1   2017-04-04
661 36  1   2017-04-05
661 37  0   2017-04-06
661 38  0   2017-04-07
661 39  0   2017-04-08
661 40  0   2017-04-09

Query used:

select b.id, b.ROW, b.tog, b.ts
from
(select id, ts, tog,
ROW_NUMBER() OVER (order by ts ASC) as ROW
from database.source_table
where id = 661
) b

Can anyone help me with a query that returns only rows 1 and 37 from the source table?

1 Answer:

Answer 0: (score: 1)

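Use row_number() plus a filter: number the rows within each (id, toggle) partition by date and keep only the first row of each partition. This query outputs rows 1 and 37: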

select b.id, b.ROW, b.toggle, b.date
from
(select id,  date, toggle,
ROW_NUMBER() OVER (partition by id, toggle order by date ASC) as rn,
ROW_NUMBER() OVER (partition by id order by date ASC) as ROW
from test_table
where id = 661
) b
where rn=1
order by date asc

Result rows:

661     1       1       2017-03-01
661     37      0       2017-04-06
Time taken: 192.38 seconds, Fetched: 2 row(s)
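
One caveat about the query above: because it partitions by (id, toggle), it returns the first row for each distinct toggle value, not every change. If the toggle later flipped back to 1, that second switch would not show up. A minimal sketch of an alternative, assuming the same test_table with columns id, date and toggle, compares each row with the previous one using lag(). The alias names row_num and prev_toggle are made up for illustration, and the backticks around date are only a precaution in case the Hive version treats it as a reserved word.

```sql
-- Sketch only: assumes test_table(id, `date`, toggle) as in the answer above.
select b.id, b.row_num, b.toggle, b.`date`
from (
  select id, `date`, toggle,
         -- running row number per id, ordered by date
         row_number() over (partition by id order by `date` asc) as row_num,
         -- toggle value of the previous row for the same id (NULL on the first row)
         lag(toggle) over (partition by id order by `date` asc) as prev_toggle
  from test_table
  where id = 661
) b
-- keep the very first row and every row whose toggle differs from the previous row
where b.prev_toggle is null or b.prev_toggle <> b.toggle
order by b.`date` asc
```

On the sample data in the question this should also return rows 1 and 37, since the toggle changes only once; it differs from the row_number() version only when a value reappears after a change.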