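I need help converting the following strings into the required format. I will have several values like the ones below. Can this be done easily with REGEXP, or is there a better way?
Current format, from column A: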
Region[Envionment Lead|||OTC|||06340|||List Program|||TX|||Z3452|||Souther Region 05|||M7894|||California Divison|||Beginning]
Region[Coding Analyst|||BA|||04561|||Water Bridge|||CA|||M8459|||West Region 09|||K04956|||East Division|||Supreme]
Required format for column A:
Region[actingname=Envionment Lead,commonid=OTC,insturmentid=06340,commonname=List Program]
Region[actingname=Coding Analyst,commonid=BA,insturmentid=04561,commonname=Water Bridge]
Revised data
**Column data**
Region[Coding Analyst|||BA|||reg pro|||04561|||08/16/2011|||Board member|||AZ|||06340|||Whiter Bridge|||CA|||M0673|||West Region 09|||K04956|||East Division|||Supreme]
**required Data**
{actingname=06340, actingid=M0673, insturmentid=BA, insturmentname=Coding Analyst, commonname=West Region 09, stdate=08/16/2011, linnumber=04561, linstate=CA, linname=Supreme}
The problem is fetching positions 10, 11, 12 and 15 of the string. I can get any position below 10, but not positions 10 and above. Could you point out what I am missing here?
'{actingname=\8,actingid=\11,insturmentid=\2,insturmentname=\1,commonname=\12, stdate=\5,linnumber=4,linstate=10,linname=15}' -- Here positions 10, 11, 12 and 15 are not being fetched
Answer 0 (score: 2)
I used REGEXP_REPLACE:
Or, along the lines of the update:
SELECT REGEXP_REPLACE(
'Region[Envionment Lead|||OTC|||06340|||List Program|||TX|||Z3452|||Souther Region 05|||M7894|||California Divison|||Beginning]',
'^Region\[([[:alpha:][:space:][:digit:]]*)\|\|\|([[:alpha:]]*)\|\|\|([[:digit:]]*)\|\|\|([[:alpha:][:space:][:digit:]]*).*',
'Region[actingname=\1,commonid=\2,instrumentid=\3,commonname=\4]') as replaced
FROM dual
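A note on the revised data: as far as I know, Oracle's REGEXP_REPLACE only recognizes backreferences \1 through \9, so \10 is read as \1 followed by a literal 0, which would explain why positions 10 and above are not fetched. A minimal sketch that sidesteps the limit by stripping the `Region[` ... `]` wrapper once and then pulling each field with REGEXP_SUBSTR and its occurrence argument. It assumes every field is non-empty and contains no `|` character; the CTE name `t` and column `s` are only for illustration.

```sql
-- Sketch only: fetch each field by occurrence number instead of a backreference.
-- Assumes no empty fields and no '|' inside a field.
WITH t (s) AS (
  -- Remove the leading 'Region[' and the trailing ']' (no replacement string = delete).
  SELECT REGEXP_REPLACE(
           'Region[Coding Analyst|||BA|||reg pro|||04561|||08/16/2011|||Board member|||AZ|||06340|||Whiter Bridge|||CA|||M0673|||West Region 09|||K04956|||East Division|||Supreme]',
           '^Region\[|\]$')
  FROM dual
)
SELECT '{actingname='      || REGEXP_SUBSTR(s, '[^|]+', 1, 8)
    || ', actingid='       || REGEXP_SUBSTR(s, '[^|]+', 1, 11)
    || ', insturmentid='   || REGEXP_SUBSTR(s, '[^|]+', 1, 2)
    || ', insturmentname=' || REGEXP_SUBSTR(s, '[^|]+', 1, 1)
    || ', commonname='     || REGEXP_SUBSTR(s, '[^|]+', 1, 12)
    || ', stdate='         || REGEXP_SUBSTR(s, '[^|]+', 1, 5)
    || ', linnumber='      || REGEXP_SUBSTR(s, '[^|]+', 1, 4)
    || ', linstate='       || REGEXP_SUBSTR(s, '[^|]+', 1, 10)
    || ', linname='        || REGEXP_SUBSTR(s, '[^|]+', 1, 15)
    || '}' AS replaced
FROM t;
```

Against a real table, the literal would be replaced by the column name. The field numbers follow the mapping given in the question (8, 11, 2, 1, 12, 5, 4, 10, 15).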
Answer 1 (score: 0)
You can use regexp_substr and listagg in succession:
with t1(str1) as
(
select 'Region[Coding Analyst|||BA|||04561|||Water Bridge]' from dual
), t2(str2) as
(
select 'actingname,commonid,insturmentid,commonname' from dual
), t3 as
(
select regexp_substr(str1, '[^|||]+', 1, level) str1,
regexp_substr(str2, '[^,]+', 1, level)||'=' str2,
level as lvl
from t1
cross join t2
connect by level <= regexp_count(str1, '[^|||]+')
), t4 as
(
select case when lvl = 1 then
replace(str1,'[','['||str2)
else
str2||str1
end as str, lvl
from t3
)
select listagg(str,',') within group (order by lvl) as "Result String" from t4;
Result String
----------------------------------------------------------------------------------------
Region[actingname=Coding Analyst,commonid=BA,insturmentid=04561,commonname=Water Bridge]
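P.S. I took the second row as the sample and kept only its first four fields, because there are four labels (each ending with an equals sign) and that has to match the number of substrings separated by the triple pipe. Demo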
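One caveat about the pattern used above: `[^|||]+` is a bracket expression, so repeating the pipe has no effect and it behaves the same as `[^|]+`. It also skips empty fields, which shifts every later position by one. A hedged sketch of a common alternative that matches up to the literal `|||` delimiter and returns the first subexpression, so an empty field still counts as a position. It assumes Oracle 11g or later (for the subexpression argument and REGEXP_COUNT); the sample string with an empty third field is made up for illustration.

```sql
-- Sketch only: split on the literal '|||' delimiter and keep empty fields.
WITH t (str) AS (
  -- Hypothetical sample: the third field is empty.
  SELECT 'Coding Analyst|||BA||||||Water Bridge' FROM dual
)
SELECT level AS pos,
       -- (.*?) lazily captures the field; the second group consumes '|||' or end of string.
       REGEXP_SUBSTR(str, '(.*?)(\|\|\||$)', 1, level, NULL, 1) AS field
FROM t
-- Number of fields = number of delimiters + 1.
CONNECT BY level <= REGEXP_COUNT(str, '\|\|\|') + 1;
```

With this sample, position 3 should come back as NULL rather than being skipped, so position 4 still points at Water Bridge.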
Answer 2 (score: 0)
This will work:
select substr(regexp_replace(regexp_replace(regexp_replace
(regexp_replace(regexp_replace("col1",'\[','[actingname='),
'\|\|\|',',commonid=',1,1,'i'),
'\|\|\|',',insturmentid=',1,1,'i'),
'\|\|\|',',commonname=',1,1,'i'),
'\|',']',1,1,'i'),
1,regexp_instr(regexp_replace(regexp_replace(regexp_replace
(regexp_replace(regexp_replace("col1",'\[','[actingname='),
'\|\|\|',',commonid=',1,1,'i'),
'\|\|\|',',insturmentid=',1,1,'i'),
'\|\|\|',',commonname=',1,1,'i'),
'\|',']',1,1,'i'),'\]')-1 )||']'
from Table1;
Check: http://sqlfiddle.com/#!4/3ddfa0/11
Thanks!