I have a Hive table column whose string values are delimited by '-', and I need to extract the substring between the first and last occurrence of '-'.
+------------------+
| col1             |
+------------------+
| abc-123-na-00-sf |
| 123-abc-01-sd    |
| 123-abcd-sd      |
+------------------+
Required output:
+-----------+
| col1      |
+-----------+
| 123-na-00 |
| abc-01    |
| abcd      |
+-----------+
Please suggest a regular expression that extracts the required output.
Thanks
Answer 0 (score: 3)
with t as (select explode(array('abc-123-na-00-sf','123-abc-01-sd','123-abcd-sd')) as str)
select regexp_extract (str,'-(.*)-',1)
from t
;
123-na-00
abc-01
abcd
or
with t as (select explode(array('abc-123-na-00-sf','123-abc-01-sd','123-abcd-sd')) as str)
select regexp_extract (str,'(?<=-).*(?=-)',0)
from t
;
123-na-00
abc-01
abcd
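
Both patterns work because `.*` is greedy: it consumes as much as it can, so the trailing '-' in the pattern (or the lookahead) ends up matching the last '-' in the string. As a rough sketch for comparison, using the same sample rows, the lazy form `.*?` stops at the second '-' instead. The column aliases `greedy_match` and `lazy_match` are illustrative names only, not part of the original answer.

```sql
with t as (select explode(array('abc-123-na-00-sf','123-abc-01-sd','123-abcd-sd')) as str)
select regexp_extract(str,'-(.*)-',1)  as greedy_match  -- first '-' up to the last '-'
      ,regexp_extract(str,'-(.*?)-',1) as lazy_match    -- first '-' up to the second '-'
from t
;
-- greedy_match   lazy_match
-- 123-na-00      123
-- abc-01         abc
-- abcd           abcd
```

The two columns agree only for the last row, which contains exactly two '-' characters, so the greedy form is the one that fits the question.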