Question

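I need some help with the regexp_replace function. I have a table with a column that holds a concatenated string value containing duplicates. How do I eliminate them?

Example: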

Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha

I need the output to be

Ian,Beatty,Larry,Neesha

The duplicates are random and not in any particular order.

Update -

Here is what my table looks like

ID    Name1   Name2   Name3
1     a       b       c
1     c       d       a
2     d       e       a
2     c       d       b

I need one row per ID, with the distinct values of name1, name2 and name3 in that row as a single comma-separated string.

ID    Name
1     a,c,b,d,c
2     d,c,e,a,b

I tried using listagg with distinct, but I am not able to remove the duplicates.

Answer 1

So, try this...

([^,]+),(?=.*[A-Za-z],[] ]*\1)
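Note that the pattern relies on a lookahead, (?=...), which as far as I know Oracle's regexp_replace does not support, so it needs a PCRE-style engine. A minimal sketch in Python, only to show what the pattern does on the sample string from the question:

```python
import re

# Pattern from the answer above: a name plus its trailing comma is removed
# when the same text appears again later in the string.
PATTERN = re.compile(r"([^,]+),(?=.*[A-Za-z],[] ]*\1)")

names = "Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha"
deduped = PATTERN.sub("", names)
print(deduped)  # Larry,Beatty,Ian,Neesha
```

The result keeps the last occurrence of each name rather than the first, so the order differs from the output requested in the question. The backreference is also not anchored to a whole token, so a name that is a prefix of a later name (Ian before Ianthe, say) would be dropped as well.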

Answer 2

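I don't think you can do this with regexp_replace alone if the repeated values are not next to each other. One approach is to split the values up, eliminate the duplicates, and then put them back together.

The common way to tokenize a delimited string is with regexp_substr and a connect by clause. Using a bind variable for the string makes the code a bit clearer: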

var value varchar2(100);
exec :value := 'Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha';

select regexp_substr(:value, '[^,]+', 1, level) as value
from dual
connect by regexp_substr(:value, '[^,]+', 1, level) is not null;

VALUE                        
------------------------------
Ian                           
Beatty                        
Larry                         
Neesha                        
Beatty                        
Neesha                        
Ian                           
Neesha

You can use that as a subquery (or CTE), get the distinct values from it, and then reassemble them with listagg:

select listagg(value, ',') within group (order by value) as value
from (
  select distinct value from (
    select regexp_substr(:value, '[^,]+', 1, level) as value
    from dual
    connect by regexp_substr(:value, '[^,]+', 1, level) is not null
  )
);

VALUE                        
------------------------------
Beatty,Ian,Larry,Neesha

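It gets a bit more complicated if you are looking at multiple rows in a table, because that confuses the connect by syntax, but you can use a non-deterministic reference to avoid loops. In the query below, id = prior id keeps each row's tokens with their own row, and prior dbms_random.value is not null stops Oracle from reporting a connect by loop: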

with t42 (id, value) as (
  select 1, 'Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha' from dual
  union all select 2, 'Mary,Joe,Mary,Frank,Joe' from dual
)
select id, listagg(value, ',') within group (order by value) as value
from (
  select distinct id, value from (
    select id, regexp_substr(value, '[^,]+', 1, level) as value
    from t42
    connect by regexp_substr(value, '[^,]+', 1, level) is not null
    and id = prior id
    and prior dbms_random.value is not null
  )
)
group by id;

        ID VALUE                        
---------- ------------------------------
         1 Beatty,Ian,Larry,Neesha       
         2 Frank,Joe,Mary
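For readers who want to check the logic outside the database, here is a rough Python sketch of the same split, distinct, aggregate steps. The rows list is a hypothetical stand-in for t42, and dedupe_per_id is an illustrative helper name, not anything Oracle provides:

```python
# Hypothetical stand-in for the t42 rows used in the query above
rows = [
    (1, "Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha"),
    (2, "Mary,Joe,Mary,Frank,Joe"),
]


def dedupe_per_id(rows):
    """Return {id: sorted, de-duplicated, comma-joined names}."""
    tokens_by_id = {}
    for row_id, value in rows:
        # split plays the role of regexp_substr/connect by; the set gives distinct
        tokens_by_id.setdefault(row_id, set()).update(value.split(","))
    # sorted + join plays the role of listagg ... within group (order by value)
    return {
        row_id: ",".join(sorted(tokens))
        for row_id, tokens in tokens_by_id.items()
    }


result = dedupe_per_id(rows)
for row_id, value in result.items():
    print(row_id, value)
```

For the two sample rows this prints the same lists as the query: Beatty,Ian,Larry,Neesha for 1 and Frank,Joe,Mary for 2.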

Of course, none of this would be necessary if you stored relational data properly; having a delimited string in a column is not a good idea.
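For the table in the question's update, where the names already sit in separate columns, no string splitting is needed. A minimal sketch in Python of the per-ID de-duplication, assuming the rows have been fetched as (ID, Name1, Name2, Name3) tuples. distinct_names_by_id is a hypothetical helper, and walking the names column by column is an assumption inferred from the order shown in the question's expected output for ID 2:

```python
from collections import defaultdict

# Sample rows from the question's update: (ID, Name1, Name2, Name3)
rows = [
    (1, "a", "b", "c"),
    (1, "c", "d", "a"),
    (2, "d", "e", "a"),
    (2, "c", "d", "b"),
]


def distinct_names_by_id(rows):
    """Return {id: comma-joined distinct names, first occurrence kept}."""
    by_id = defaultdict(list)
    for row in rows:
        by_id[row[0]].append(row[1:])
    result = {}
    for key, name_rows in by_id.items():
        # zip(*name_rows) walks column by column: all Name1, then Name2, ...
        ordered = [name for column in zip(*name_rows) for name in column]
        # dict.fromkeys drops repeats while keeping first-seen order
        result[key] = ",".join(dict.fromkeys(ordered))
    return result


names_by_id = distinct_names_by_id(rows)
print(names_by_id)
```

For the sample rows this gives a,c,b,d for ID 1 and d,c,e,a,b for ID 2.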

Answer 3

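The best way to accomplish this is to skip the RegEx: split the names on the comma, convert the resulting list to a set, and then call the string join method on ',' and pass in the set.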

>>> names = 'Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha'
>>> deduped_names = ','.join(set(names.split(',')))
>>> print(deduped_names)
Neesha,Ian,Larry,Beatty
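A set does not keep the original order, and the order it prints can differ between runs. If the first-occurrence order from the question matters, one option is dict.fromkeys, which keeps insertion order on Python 3.7 and later. A minimal sketch, with dedupe_csv as a hypothetical helper name:

```python
def dedupe_csv(value, sep=","):
    """Drop repeated items, keeping the first occurrence of each."""
    # dict keys are unique and, since Python 3.7, keep insertion order
    return sep.join(dict.fromkeys(value.split(sep)))


names = "Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha"
print(dedupe_csv(names))  # Ian,Beatty,Larry,Neesha
```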

Removing duplicate values from a comma-separated string in Oracle
