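SQL, Oracle - Right-trim a string (remove all parameters from a URL)

Time: 2015-11-11 11:28:11

Tags: sql oracle oracle11g

I am querying a column of URLs. The URLs come from different sources and have different formats, and some of them have parameters. I would like to query this column and right-trim each URL at the first parameter symbol.

Sample URLs: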

URLs
http://www.domain1.com/path/page?parameters1&parameters2
https://www.domain2.com/path/page?parameters1&parameters2/somemorestufftoscrape
domain3.com/path/page?parameters1&parameters2
http://www.domain4.com/path/page&parameters1?parameters2
https://www.domain5.com/path/noparametershere.html
domain6.com/path/page=?parameters1&parameters2

I want to trim everything from the first ?, & or = onward (these are the characters that mark parameters in my case).

Desired output:

TrimmedURLs
http://www.domain1.com/path/page
https://www.domain2.com/path/page
domain3.com/path/page
http://www.domain4.com/path/page
https://www.domain5.com/path/noparametershere.html
domain6.com/path/page

I tried using RTRIM as follows:

select
   URLs,
   rtrim(URLs, '?=&') as TrimmedURLs
from
   MyTable;

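The query runs, but the TrimmedURLs column is identical to the URLs column (am I doing something wrong?).

I also tried regexp_substr, but when a URL contains several parameter characters it trims from the last one rather than the first (see the first comment on the page).

  1. What query gives the desired result?
  2. Why doesn't RTRIM work for me?
  3. The server is Oracle 11g; the URL column is of type VARCHAR2(1024).

    Thanks!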

2 answers:

Answer 0: (score: 3)

REGEXP_SUBSTR() sounds like the right tool here:

with sample_data as (select 'http://www.domain1.com/path/page?parameters1&parameters2' url from dual union all
                     select 'https://www.domain2.com/path/page?parameters1&parameters2/somemorestufftoscrape' url from dual union all
                     select 'domain3.com/path/page?parameters1&parameters2' url from dual union all
                     select 'http://www.domain4.com/path/page&parameters1?parameters2' url from dual union all
                     select 'https://www.domain5.com/path/noparametershere.html' url from dual union all
                     select 'domain6.com/path/page=?parameters1&parameters2' url from dual)
select url,
       regexp_substr(url, '[^?&=]+', 1, 1) main_url
from   sample_data;

URL                                                                             MAIN_URL                                                    
------------------------------------------------------------------------------- ------------------------------------------------------------
http://www.domain1.com/path/page?parameters1&parameters2                        http://www.domain1.com/path/page                            
https://www.domain2.com/path/page?parameters1&parameters2/somemorestufftoscrape https://www.domain2.com/path/page                           
domain3.com/path/page?parameters1&parameters2                                   domain3.com/path/page                                       
http://www.domain4.com/path/page&parameters1?parameters2                        http://www.domain4.com/path/page                            
https://www.domain5.com/path/noparametershere.html                              https://www.domain5.com/path/noparametershere.html          
domain6.com/path/page=?parameters1&parameters2                                  domain6.com/path/page 
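
On the second question (why RTRIM appeared to do nothing): RTRIM only strips characters from the right-hand end of the string, and it stops at the first character that is not in the given set. A URL that ends in an ordinary character is therefore returned unchanged. A minimal sketch on literals (the strings below are made up for illustration and are not taken from the question's table):

```sql
-- In SQL*Plus or SQL Developer, run SET DEFINE OFF first so that '&'
-- inside a literal is not treated as a substitution variable.
select rtrim('page?a=1&b=2',   '?=&')          as ends_in_2,       -- 'page?a=1&b=2' (last char '2' is not in the set)
       rtrim('page?a=1&b=?&=', '?=&')          as ends_in_delims,  -- 'page?a=1&b'   (only the trailing '=?&=' is stripped)
       regexp_substr('page?a=1&b=2', '[^?&=]+') as first_run       -- 'page'         (first run of non-delimiter characters)
from   dual;
```

By contrast, the pattern '[^?&=]+' returns the first run of characters that contains none of the three delimiters, which is why it cuts at the first delimiter. One caveat: if a string starts with a delimiter, that pattern returns the run after it rather than an empty result.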

Answer 1: (score: -1)

If you don't like regexp, you can also use a combination of the substr and instr functions:

with sample_data as (select 'http://www.domain1.com/path/page?parameters1&parameters2' url from dual union all
                 select 'https://www.domain2.com/path/page?parameters1&parameters2/somemorestufftoscrape' url from dual union all
                 select 'domain3.com/path/page?parameters1&parameters2' url from dual union all
                 select 'http://www.domain4.com/path/page&parameters1?parameters2' url from dual union all
                 select 'https://www.domain5.com/path/noparametershere.html' url from dual union all
                 select 'domain6.com/path/page=?parameters1&parameters2' url from dual)
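select
    url,
    -- map '&' and '=' to '?' so a single INSTR finds the first of the three delimiters;
    -- the appended '?' guarantees a match when the URL has no parameters
    substr(url, 1, instr(translate(url, '&=', '??') || '?', '?') - 1) main_url
from
    sample_data;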
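
A note on the SUBSTR/INSTR approach: INSTR returns 0 when the search character is absent, so a length of INSTR(...) - 1 becomes -1 and SUBSTR returns NULL. The sketch below shows that behaviour on a literal, together with one possible guard (appending the delimiter before searching). The literal and the guard are assumptions of this example, not something taken from the question:

```sql
select instr('a.com/page', '?')                                    as pos_missing,  -- 0: '?' does not occur
       substr('a.com/page', 1, instr('a.com/page', '?') - 1)        as unguarded,    -- NULL: the requested length is -1
       substr('a.com/page', 1, instr('a.com/page' || '?', '?') - 1) as guarded       -- 'a.com/page'
from   dual;
```

The same consideration applies to each delimiter that may be missing, so any variant that searches for '?', '&' and '=' needs a fallback for the "not found" case.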