How to extract the string after the last occurrence of a word in Hive

Time: 2018-12-18 17:03:43

Tags: hive

One of my Hive tables has a string column:

select * from
(
select "edition_xx/news/radio_today_news_xx" as my_column
union all 
select "edition_xx/news/news/television_1.3" as my_column
) A 

I want to extract the part of the string that follows news/, so that my output column looks like this:

radio_today_news_xx
television_1.3

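How can I extract this in Hive with a regular expression? Note that news/ can occur any number of times, and I want the text after its last occurrence in the string.

2 answers:

Answer 0 (score: 2)

Use split():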

select  split(my_column,'(news/)+')[1] 
from
(
select "edition_xx/news/radio_today_news_xx" as my_column
union all 
select "edition_xx/news/news/television_1.3" as my_column
) A;

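The regular expression (news/)+ matches one or more consecutive occurrences of news/, so a run of repeated news/ is treated as a single delimiter, and [1] is the piece that follows it.

Result: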

radio_today_news_xx
television_1.3
Time taken: 37.218 seconds, Fetched: 2 row(s)
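
One caveat, as a sketch rather than part of the original answer: index [1] returns the piece after the first run of news/, which matches the last occurrence only when all occurrences are adjacent. The row below is a hypothetical value that does not appear in the question; it separates the two occurrences with radio/ to show the difference.

```sql
-- Hypothetical row: the two news/ occurrences are NOT adjacent.
select split(my_column, '(news/)+')    as parts,
       split(my_column, '(news/)+')[1] as second_part
from
(
select "edition_xx/news/radio/news/television_1.3" as my_column
) A;
-- parts has three elements: "edition_xx/", "radio/", "television_1.3"
-- second_part is "radio/", not the text after the last news/
```

For the two rows in the question the occurrences are adjacent, so [1] happens to be the last piece.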

Answer 1 (score: 0)

Use split and take the last element. size(...)-1 is the index of the last piece, so this returns the text after the last news/ wherever it occurs:

select split(A.my_column,'news/')[size(split(A.my_column,'news/'))-1] as lt
from
(
select "edition_xx/news/radio_today_news_xx" as my_column
union all 
select "edition_xx/news/news/television_1.3" as my_column
union all
select "edition_xx/news/radio_today/news_xx" as my_column
) A;
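
Since the question asks for a regular expression, a possible alternative (not taken from either answer) is regexp_extract with a greedy prefix. In the pattern below, the leading .* consumes as much as it can, so the literal news/ that follows it lines up with the last occurrence in the string, and capture group 1 holds whatever comes after. This is a minimal sketch that assumes every row contains news/ at least once.

```sql
-- Greedy .* pushes the match of news/ to its last occurrence;
-- group 1 captures the remainder of the string.
select my_column,
       regexp_extract(my_column, '.*news/(.*)$', 1) as after_last_news
from
(
select "edition_xx/news/radio_today_news_xx" as my_column
union all
select "edition_xx/news/news/television_1.3" as my_column
union all
select "edition_xx/news/radio_today/news_xx" as my_column
) A;
-- after_last_news per row:
--   radio_today_news_xx
--   television_1.3
--   radio_today/news_xx
```

Compared with the split-based approach, this evaluates the pattern once per row and avoids building the array twice.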