I am trying to remove duplicates from the result of a query that involves LISTAGG. I am using this syntax:
For i = 1 To LastRow
sOut = vbNullString
For j = 1 To LastCol
str = .Cells(i, j).Value
MStr = ws2.Cells(i, j).Value
Lstr = Len(str)
rest = MStr - Lstr
sOut = sOut & str & Space(rest)
Next
Print #2, sOut
Next
Next
However, occurrences that contain Chinese characters are not removed:
Any ideas?
Answer 0: (score: 1)
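Your regular expression does not work. If the LISTAGG output is A,A,AA, the regular expression ([^,]+)(,\1)+ does not check that it has matched complete elements of the list. It matches A,A,A (two whole elements plus the first character of the third) and outputs AA instead of the expected A,AA. Worse, given the string BA,BABAB,BABD, the regular expression replaces BA,BA with BA, then BAB,BAB with BAB, and you end up with the string BABABD, which does not match any element of the original list.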
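Both cases can be checked directly on string literals, without any table. This is only a minimal sketch: the column aliases example_1 and example_2 are made up for illustration, and the expected values in the comments are the ones derived in the paragraph above.

```sql
-- Apply the question's pattern to the two literal strings discussed above.
SELECT REGEXP_REPLACE('A,A,AA',        '([^,]+)(,\1)+', '\1') AS example_1, -- AA, not A,AA
       REGEXP_REPLACE('BA,BABAB,BABD', '([^,]+)(,\1)+', '\1') AS example_2  -- BABABD
FROM DUAL;
```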
An example that demonstrates this:
Oracle 11g R2 Schema Setup:
CREATE TABLE names ( id, name ) AS
SELECT 1, 'A' FROM DUAL UNION ALL
SELECT 2, 'A' FROM DUAL UNION ALL
SELECT 3, 'B' FROM DUAL UNION ALL
SELECT 4, 'C' FROM DUAL UNION ALL
SELECT 5, 'A' FROM DUAL UNION ALL
SELECT 6, 'AA' FROM DUAL UNION ALL
SELECT 7, 'A' FROM DUAL UNION ALL
SELECT 8, 'BA' FROM DUAL UNION ALL
SELECT 9, 'A' FROM DUAL
/
Query 1:
SELECT REGEXP_REPLACE (
LISTAGG (NAME, ',' ) WITHIN GROUP (ORDER BY 1),
'([^,]+)(,\1)+',
'\1'
) AS constant_sort
FROM names
Results:
| CONSTANT_SORT |
|---------------|
| AA,BA,C |
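If the regular-expression approach has to stay, one commonly used variation (not part of this answer, so treat it as an assumption to test on your own data) is to require that each match ends at a comma or at the end of the string, so that only complete elements are compared. The sketch below reuses the names table from the schema setup; the alias anchored_regex is made up. It depends on sorting by the name column itself rather than by the constant 1, so that equal names are adjacent, and the aggregated string is still built in full before the duplicates are stripped.

```sql
-- Assumption: duplicates are adjacent because LISTAGG orders by name.
-- (,|$) forces the match to stop on an element boundary; \3 puts the
-- delimiter (or nothing, at the end of the string) back after the kept element.
SELECT REGEXP_REPLACE(
         LISTAGG( name, ',' ) WITHIN GROUP ( ORDER BY name ),
         '([^,]+)(,\1)*(,|$)',
         '\1\3'
       ) AS anchored_regex
FROM   names;
```

Removing the duplicates before aggregating, as described next, avoids the pattern matching altogether.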
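If you want the distinct elements, you can either use DISTINCT (as in Littlefoot's answer), or COLLECT the values into a user-defined collection and then use the SET function to remove the duplicates. You can then pass this de-duplicated collection to a table collection expression and use LISTAGG to produce the output: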
Oracle 11g R2 Schema Setup:
CREATE TYPE StringList IS TABLE OF VARCHAR2(4000)
/
Query 2:
SELECT (
SELECT LISTAGG( column_value, ',' )
WITHIN GROUP ( ORDER BY ROWNUM )
FROM TABLE( n.unique_names )
) AS agg_names
FROM (
SELECT SET( CAST( COLLECT( name ORDER BY NAME ) AS StringList ) )
AS unique_names
FROM names
) n
Results:
| AGG_NAMES |
|-------------|
| A,AA,B,BA,C |
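Regarding your comment:
"In the context of a bigger query involving lots of joins, and given my beginner's skills, I don't know how to implement this model."
For example, if your query is: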
SELECT REGEXP_REPLACE(
LISTAGG (PR.NAME, ',' ) WITHIN GROUP (ORDER BY 1),
'([^,]+)(,\1)+',
'\1'
) AS PRODUCERS,
other_column1,
other_column2
FROM table1 pr
INNER JOIN table2 t2
ON (pr.some_condition = t2.some_condition )
WHERE t2.some_other_condition = 'TRUE'
GROUP BY other_column1, other_column2
Then you can change it to:
SELECT (
SELECT LISTAGG( COLUMN_VALUE, ',' ) WITHIN GROUP ( ORDER BY ROWNUM )
FROM TABLE( t.PRODUCERS )
) AS producers,
other_column1,
other_column2
FROM (
SELECT SET( CAST( COLLECT( PR.name ORDER BY PR.NAME ) AS StringList ) )
AS PRODUCERS,
other_column1,
other_column2
FROM table1 pr
INNER JOIN table2 t2
ON (pr.some_condition = t2.some_condition )
WHERE t2.some_other_condition = 'TRUE'
GROUP BY other_column1, other_column2
) t
Answer 1: (score: 0)
(I can't see the images; company policy.)
Why not remove the duplicates before applying LISTAGG? Something like this:
select listagg(x.distinct_name, ',') within group (order by 1) producers
from (select DISTINCT name distinct_name
from some_table
) x
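As a concrete instance of the same idea, here is a sketch against the names table defined in the first answer (the alias agg_names is borrowed from that answer's Query 2). Because the inline view already returns each name once, LISTAGG has nothing left to de-duplicate; ordering by the name column rather than by a constant makes the output order deterministic.

```sql
-- De-duplicate first, then aggregate.
SELECT LISTAGG( name, ',' ) WITHIN GROUP ( ORDER BY name ) AS agg_names
FROM   ( SELECT DISTINCT name FROM names );
-- With the sample rows from the first answer this gives A,AA,B,BA,C
```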
Answer 2: (score: 0)
Another way to remove duplicates is to use a window function and case:
select listagg(case when seqnum = 1 then name end, ',') within group (order by 1) as producers
from (select . . .,
row_number() over (partition by name order by name) as seqnum
from . . .
) t
This does require modifying the rest of the query, but you can still perform the rest of the aggregations and calculations.
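The elided parts above depend on your own query. As a filled-in sketch only, here is the same pattern run against the names table from the first answer, which stands in for your real source. It relies on LISTAGG ignoring NULL values: every row other than the first in each name partition produces NULL from the CASE expression and is therefore left out of the list.

```sql
-- Keep only the first row of each name; the others become NULL and are skipped.
SELECT LISTAGG( CASE WHEN seqnum = 1 THEN name END, ',' )
         WITHIN GROUP ( ORDER BY name ) AS producers
FROM   (
         SELECT name,
                ROW_NUMBER() OVER ( PARTITION BY name ORDER BY name ) AS seqnum
         FROM   names
       ) t;
-- With the sample rows from the first answer this gives A,AA,B,BA,C
```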