Similar to this question...
How can I use regex to split a string, using a string as a delimiter?
...I am trying to split the following string:
Spent 30 CAD in movie tickets at Cineplex on 2018-06-01
The output I want is:
ELEMENT ELEMENT_VALUE
------- -------------
      1 Spent
      2 30
      3 CAD
      4 movie tickets
      5 Cineplex
      6 2018-06-01
Similarly, it should be able to handle:
Paid 600 EUR to Electric Company
producing:
ELEMENT ELEMENT_VALUE
------- -------------
      1 Paid
      2 600
      3 EUR
      4
      5 Electric Company
I tried this regex, to no avail:
(\w+)(\D+)(\w+)(?(?=in)(\w+)(at)(\w+)(on)(.?$)|((?=to)(\w+)(.?$)))
I looked at several regex sites, plus this post, without much luck:
Extract some part of text separated by delimiter using regex
Can anyone help?
Answer 0 (score: 1)
Here is a simple SQL tokenizer that splits on spaces:
select regexp_substr('Spent 30 CAD in movie tickets at Cineplex on 2018-06-01','[^ ]+', 1, level) from dual
connect by regexp_substr('Spent 30 CAD in movie tickets at Cineplex on 2018-06-01', '[^ ]+', 1, level) is not null
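Answer 1 (score: 0)
There are two problems with your required output. The first is how to define the tokens you want to exclude ('on', 'at', etc.). The second is how to ignore the spaces inside certain tokens ('Electric Company', 'movie tickets').
The first point is easy to solve with a two-step approach. Step #1 splits the string on spaces, and step #2 removes the unwanted tokens: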
with exclude as (
select 'in' as tkn from dual union all
select 'at' as tkn from dual union all
select 'to' as tkn from dual union all
select 'on' as tkn from dual
)
, str as (
select id
, level as element_order
, regexp_substr(txt, '[^ ]+', 1, level) as tkn
from t23
where id = 10
CONNECT BY level <= regexp_count(txt, '[^ ]+')+1
and id = prior id
and prior sys_guid() is not null
)
select row_number() over (partition by str.id order by str.element_order) as element
, str.tkn as element_value
from str
left join exclude on exclude.tkn = str.tkn
where exclude.tkn is null
and str.tkn is not null
;
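The second point is harder to solve. I think you would need another lookup table to identify the multi-word tokens, and perhaps use listagg() to concatenate them.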
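As a rough illustration of the listagg() idea, here is a minimal sketch that takes a different route from a second lookup table: it treats each excluded word as a phrase boundary and concatenates the tokens between two boundaries. It assumes the first three words (verb, amount, currency) are always single tokens, and the names src, tok and marked are hypothetical. The sample string is inlined so the query does not depend on t23:

```sql
with src as (
  -- sample input, inlined for illustration
  select 'Spent 30 CAD in movie tickets at Cineplex on 2018-06-01' as txt from dual
)
, exclude as (
  select 'in' as tkn from dual union all
  select 'at' from dual union all
  select 'to' from dual union all
  select 'on' from dual
)
, tok as (
  -- step 1: split on spaces, keeping each token's position
  select level as pos
       , regexp_substr(txt, '[^ ]+', 1, level) as tkn
  from src
  connect by level <= regexp_count(txt, '[^ ]+')
)
, marked as (
  -- step 2: flag the separator words and count how many have been seen so far
  select tok.pos
       , tok.tkn
       , case when exclude.tkn is not null then 1 else 0 end as is_sep
       , sum(case when exclude.tkn is not null then 1 else 0 end)
           over (order by tok.pos) as grp
  from tok
  left join exclude on exclude.tkn = tok.tkn
)
-- step 3: before the first separator every token stands alone (assumption);
-- after it, all tokens between two separators are joined into one value
select row_number() over (order by min(pos)) as element
     , listagg(tkn, ' ') within group (order by pos) as element_value
from marked
where is_sep = 0
group by grp, case when grp = 0 then pos else 0 end
order by element;
```

For the first sample string this should return Spent, 30, CAD, movie tickets, Cineplex and 2018-06-01 as elements 1 to 6. It does not keep an empty slot for a missing phrase, so 'Paid 600 EUR to Electric Company' would come back as four elements rather than five, and a phrase that itself contains one of the excluded words would be split at that word.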