Get unique values using STRING_AGG in SQL Server

Date: 2018-05-29 16:33:50

Tags: sql sql-server sql-server-2017 string-aggregation

The following query returns the results shown below:

SELECT 
    ProjectID, newID.value
FROM 
    [dbo].[Data] WITH(NOLOCK)  
CROSS APPLY 
    STRING_SPLIT([bID],';') AS newID  
WHERE 
    newID.value IN ('O95833', 'Q96NY7-2') 

Result:

ProjectID   value
---------------------
2           Q96NY7-2
2           O95833
2           O95833
2           Q96NY7-2
2           O95833
2           Q96NY7-2
4           Q96NY7-2
4           Q96NY7-2

Using the STRING_AGG function newly added in SQL Server 2017, as in the following query, I can get the result set below.

SELECT 
    ProjectID,
    STRING_AGG(newID.value, ',') WITHIN GROUP (ORDER BY newID.value) AS NewField
FROM
    [dbo].[Data] WITH(NOLOCK)  
CROSS APPLY 
    STRING_SPLIT([bID],';') AS newID  
WHERE 
    newID.value IN ('O95833', 'Q96NY7-2')  
GROUP BY 
    ProjectID
ORDER BY 
    ProjectID

Result:

ProjectID   NewField
-------------------------------------------------------------
2           O95833,O95833,O95833,Q96NY7-2,Q96NY7-2,Q96NY7-2
4           Q96NY7-2,Q96NY7-2

I want my final output to contain only unique elements, like this:

ProjectID   NewField
-------------------------------
2           O95833, Q96NY7-2
4           Q96NY7-2

Any suggestions on how to get this result? Feel free to optimize or redesign my query from scratch if needed.
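
To try the queries in this thread, something like the following setup may help. It is only a sketch: the question does not show the schema of [dbo].[Data], so the temp table #Data, its column types, and the sample rows (including the filler ID 'P12345') are assumptions chosen so that the first query yields the same row counts as above.

```sql
-- Hypothetical sample data; the real [dbo].[Data] schema is not shown in the question.
CREATE TABLE #Data (ProjectID INT NOT NULL, bID VARCHAR(200) NOT NULL);

INSERT INTO #Data (ProjectID, bID) VALUES
    (2, 'Q96NY7-2;O95833'),
    (2, 'O95833;Q96NY7-2;P12345'),   -- 'P12345' is a made-up ID that the filter excludes
    (2, 'O95833;Q96NY7-2'),
    (4, 'Q96NY7-2'),
    (4, 'Q96NY7-2;P12345');

-- Same shape as the first query in the question, pointed at the temp table.
-- Returns 8 rows: six for ProjectID 2 and two for ProjectID 4.
SELECT d.ProjectID, s.value
FROM #Data AS d
CROSS APPLY STRING_SPLIT(d.bID, ';') AS s
WHERE s.value IN ('O95833', 'Q96NY7-2');
```

With this data the duplicates for a given ProjectID come from several different rows, not from within a single bID string.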

7 answers:

Answer 0: (score: 3)

Use the DISTINCT keyword in a subquery to remove duplicates before combining the results:

SELECT 
    ProjectID,
    STRING_AGG(value, ',') WITHIN GROUP (ORDER BY value) AS NewField
FROM (
    -- Remove duplicate (ProjectID, value) pairs before aggregating
    SELECT DISTINCT ProjectID, newID.value 
    FROM [dbo].[Data] WITH(NOLOCK)  
    CROSS APPLY STRING_SPLIT([bID],';') AS newID  
    WHERE newID.value IN ('O95833', 'Q96NY7-2')  
) x
GROUP BY ProjectID
ORDER BY ProjectID

Answer 1: (score: 2)

Here is a function I wrote that answers the title of the OP's question. Improvements are welcome!

CREATE OR ALTER FUNCTION [dbo].[fn_DistinctWords]
(
  @String NVARCHAR(MAX)  
)
RETURNS NVARCHAR(MAX)
WITH SCHEMABINDING
AS
BEGIN
  DECLARE @Result NVARCHAR(MAX);
  -- Split on spaces, keep each distinct token once, then join them again
  WITH MY_CTE AS ( SELECT DISTINCT value FROM STRING_SPLIT(@String, ' ') )
  SELECT @Result = STRING_AGG(value, ' ') FROM MY_CTE;
  RETURN @Result;
END
GO

Use it like this:

SELECT dbo.fn_DistinctWords('One Two      Three Two One');
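
One thing to watch for with input like the above: STRING_SPLIT returns an empty string for each pair of consecutive separators, so the run of spaces produces empty tokens, and DISTINCT keeps one of them. A minimal sketch of the same split / DISTINCT / re-aggregate idea that skips empty tokens is shown below; the WITHIN GROUP ordering and the alias names are my additions, not part of the function above.

```sql
-- Sketch: skip the empty tokens that STRING_SPLIT returns for consecutive separators.
DECLARE @String NVARCHAR(MAX) = N'One Two      Three Two One';

SELECT STRING_AGG(t.value, N' ') WITHIN GROUP (ORDER BY t.value) AS DistinctWords
FROM (
    SELECT DISTINCT value
    FROM STRING_SPLIT(@String, N' ')
    WHERE value <> N''          -- drop empty tokens
) AS t;
```

The same WHERE clause could be added inside the CTE of the function if empty tokens are not wanted in its result.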

Answer 2: (score: 1)

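You can use DISTINCT in the subquery that you pass to APPLY. Note that this removes duplicates only within each row's bID list; values repeated across different rows of the same ProjectID are still aggregated more than once.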

SELECT d.ProjectID,
       STRING_AGG(newID.value, ',') WITHIN GROUP (ORDER BY newID.value) AS NewField
FROM [dbo].[Data] d CROSS APPLY
     (SELECT DISTINCT s.value
      FROM STRING_SPLIT(d.[bID], ';') AS s
     ) newID
WHERE newID.value IN ('O95833', 'Q96NY7-2') 
GROUP BY d.ProjectID;

Answer 3: (score: 0)

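As @SeanLange pointed out in the comments, this is a poor way to pass data around, but if you must, just do it as two separate queries, one nested inside the other, as follows: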

SELECT 
    ProjectID
    ,STRING_AGG( val, ',') WITHIN GROUP (ORDER BY val) AS NewField
FROM
(
    SELECT DISTINCT 
        ProjectID
        ,newID.value AS val
    FROM 
        [dbo].[Data] WITH(NOLOCK)  
        CROSS APPLY STRING_SPLIT([bID],';') AS newID  
    WHERE 
        newID.value IN ('O95833' , 'Q96NY7-2') 
) t
GROUP BY
    ProjectID

That should do it.

Answer 4: (score: 0)

You can create a distinct view over the table that holds the values to aggregate, which is even simpler:

CREATE TABLE Test (field1 varchar(1), field2 varchar(1));

GO

-- DISTINCT alone is enough here; the original also had a redundant GROUP BY
CREATE VIEW DistinctTest AS (SELECT DISTINCT field1, field2 FROM Test);

GO

INSERT INTO Test SELECT 'A', '1';
INSERT INTO Test SELECT 'A', '2';
INSERT INTO Test SELECT 'A', '2';
INSERT INTO Test SELECT 'A', '2';
INSERT INTO Test SELECT 'D', '1';
INSERT INTO Test SELECT 'D', '1';

SELECT STRING_AGG(field1, ',') FROM Test WHERE field2 = '1';  /* duplicates: A,D,D */

SELECT STRING_AGG(field1, ',') FROM DistinctTest WHERE field2 = '1';  /* no duplicates: A,D */

Answer 5: (score: 0)

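Another way to get unique strings from STRING_AGG is to perform these three steps after building the comma-separated string:

  1. Split the string (STRING_SPLIT)
  2. Select DISTINCT from the split values
  3. Apply STRING_AGG again to that selection, grouped on a single key

Example: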

(select STRING_AGG(CAST(value as VARCHAR(MAX)), ',') 
        from (SELECT distinct 1 single_key, value 
            FROM STRING_SPLIT(STRING_AGG(CAST(customer_division as VARCHAR(MAX)), ','), ',')) 
                q group by single_key) as customer_division
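
The fragment above is a column expression meant to sit inside a larger grouped query that is not shown. To try the three steps in isolation, a minimal sketch on an already aggregated string might look like this; the variable @csv and its contents are hypothetical, and the WITHIN GROUP ordering is my addition.

```sql
-- Hypothetical input standing in for an already aggregated, comma-separated string.
DECLARE @csv VARCHAR(MAX) = 'B,A,B,C,A';

SELECT STRING_AGG(CAST(q.value AS VARCHAR(MAX)), ',')
           WITHIN GROUP (ORDER BY q.value) AS deduped      -- step 3: aggregate again
FROM (
    SELECT DISTINCT value                                  -- step 2: keep each value once
    FROM STRING_SPLIT(@csv, ',')                           -- step 1: split the string
) AS q;
```

Because the outer query here has no other grouping columns, the constant single_key from the fragment is not needed in this reduced form.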

Answer 6: (score: 0)

Oracle (since version 19c) supports LISTAGG(DISTINCT ...), but Microsoft SQL Server probably does not.
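
For comparison, a sketch of what the Oracle form looks like. This is Oracle SQL, not T-SQL, and the CTE name project_values, its columns, and the inline rows are hypothetical sample data.

```sql
-- Oracle 19c or later. project_values and its rows are hypothetical sample data.
WITH project_values (project_id, val) AS (
    SELECT 2, 'O95833'   FROM dual UNION ALL
    SELECT 2, 'Q96NY7-2' FROM dual UNION ALL
    SELECT 2, 'O95833'   FROM dual UNION ALL
    SELECT 4, 'Q96NY7-2' FROM dual UNION ALL
    SELECT 4, 'Q96NY7-2' FROM dual
)
SELECT project_id,
       LISTAGG(DISTINCT val, ',') WITHIN GROUP (ORDER BY val) AS new_field
FROM project_values
GROUP BY project_id
ORDER BY project_id;
```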