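I am querying a column of URLs. The URLs come from different sources and have different formats, and some of them carry parameters. I want to query this column and trim each URL on the right, starting at the first parameter character.
Example URLs: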
URLs
http://www.domain1.com/path/page?parameters1&parameters2
https://www.domain2.com/path/page?parameters1&parameters2/somemorestufftoscrape
domain3.com/path/page?parameters1&parameters2
http://www.domain4.com/path/page&parameters1?parameters2
https://www.domain5.com/path/noparametershere.html
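domain6.com/path/page=?parameters1&parameters2
I want to trim everything from the first ?, & or = onward (this is the list of characters that mark parameters in my case).
Desired output: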
TrimmedURLs
http://www.domain1.com/path/page
https://www.domain2.com/path/page
domain3.com/path/page
http://www.domain4.com/path/page
https://www.domain5.com/path
domain6.com/path/page
I tried using RTRIM, as follows:
select
  URLs,
  rtrim(URLs, '?=&') as TrimmedURLs
from
  MyTable;
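The query runs, but the TrimmedURLs column is identical to the URLs column (what am I doing wrong?).
I also tried regexp_substr, but when a URL contains more than one parameter character it trims from the last one rather than the first (see the first comment on the page).
The server is Oracle 11g.
The URL column is of type VARCHAR2(1024).
Thanks!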
Answer 0 (score: 3)
REGEXP_SUBSTR() sounds like the thing to use here:
with sample_data as (select 'http://www.domain1.com/path/page?parameters1&parameters2' url from dual union all
select 'https://www.domain2.com/path/page?parameters1&parameters2/somemorestufftoscrape' url from dual union all
select 'domain3.com/path/page?parameters1&parameters2' url from dual union all
select 'http://www.domain4.com/path/page&parameters1?parameters2' url from dual union all
select 'https://www.domain5.com/path/noparametershere.html' url from dual union all
select 'domain6.com/path/page=?parameters1&parameters2' url from dual)
select url,
regexp_substr(url, '[^?&=]+', 1, 1) main_url
from sample_data;
URL                                                                             MAIN_URL
------------------------------------------------------------------------------- ------------------------------------------------------------
http://www.domain1.com/path/page?parameters1&parameters2                        http://www.domain1.com/path/page
https://www.domain2.com/path/page?parameters1&parameters2/somemorestufftoscrape https://www.domain2.com/path/page
domain3.com/path/page?parameters1&parameters2                                   domain3.com/path/page
http://www.domain4.com/path/page&parameters1?parameters2                        http://www.domain4.com/path/page
https://www.domain5.com/path/noparametershere.html                              https://www.domain5.com/path/noparametershere.html
domain6.com/path/page=?parameters1&parameters2                                  domain6.com/path/page
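As for why the RTRIM attempt left the URLs unchanged: RTRIM only strips characters from the right-hand end of the string, and it stops at the first character that is not in the set. It never looks for the set characters in the middle of the value. A minimal sketch with made-up literals (not the asker's data) illustrates the difference:

```sql
-- Illustrative literals only; they are not taken from MyTable.
select
  rtrim('page?&=', '?=&')  as trailing_only,  -- 'page': the trailing ?, & and = are stripped
  rtrim('page?a=1', '?=&') as unchanged       -- 'page?a=1': the last character '1' is not in the set
from dual;
```

Since the sample URLs all end in a character other than ?, & or =, RTRIM has nothing to remove, which matches what the question reports.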
Answer 1 (score: -1)
If you don't like regexp, you can also use a combination of the substr and instr functions:
with sample_data as (select 'http://www.domain1.com/path/page?parameters1&parameters2' url from dual union all
select 'https://www.domain2.com/path/page?parameters1&parameters2/somemorestufftoscrape' url from dual union all
select 'domain3.com/path/page?parameters1&parameters2' url from dual union all
select 'http://www.domain4.com/path/page&parameters1?parameters2' url from dual union all
select 'https://www.domain5.com/path/noparametershere.html' url from dual union all
select 'domain6.com/path/page=?parameters1&parameters2' url from dual)
select
  url,
  -- keeps the text before the first '?'; gives NULL when the url contains no '?'
  substr(url, 1, instr(url, '?') - 1) main_url
from
  sample_data;
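Note that the substr/instr query above only looks for '?', so it ignores & and =, and it yields NULL for a URL with no '?' at all (instr returns 0, which makes the requested length negative). One possible way around both, sketched here as an assumption rather than a tested solution for the asker's table, is to map & and = onto ? with translate before searching, and to fall back to the full URL when nothing is found:

```sql
-- In SQL*Plus or SQL Developer, run "set define off" first so that '&' inside
-- the literals is not treated as a substitution variable.
with sample_data as (
  select 'http://www.domain4.com/path/page&parameters1?parameters2' url from dual union all
  select 'https://www.domain5.com/path/noparametershere.html' url from dual union all
  select 'domain6.com/path/page=?parameters1&parameters2' url from dual
)
select
  url,
  case
    -- no ?, & or = anywhere: keep the URL as it is
    when instr(translate(url, '&=', '??'), '?') = 0 then url
    -- otherwise keep everything before the first of the three characters
    else substr(url, 1, instr(translate(url, '&=', '??'), '?') - 1)
  end as main_url
from sample_data;
```

translate replaces characters one for one, so the positions found in the translated string line up with the original URL. A URL that begins with one of the three characters would come back as NULL, because the part to keep is empty.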