Remove duplicates from a comma- or pipe-delimited string

Time: 2017-03-21 04:36:19

Tags: sql-server sql-server-2008 tsql

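I have been researching this for a while and cannot find a way, in SQL Server, to remove duplicate values from a string that is delimited by both commas and pipes.

Given the string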

test1,test2,test1|test2,test3|test4,test4|test4

Does anyone know how to return test1,test2,test3,test4?

5 answers:

Answer 0 (score: 4)

Approach

The following approach can be used to de-duplicate a delimited list of values.

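  1. Use the REPLACE() function to convert the different delimiters to a single delimiter.
  2. Use the REPLACE() function to inject closing and opening XML tags, creating an XML fragment.
  3. Use CAST(expr AS XML) to convert that fragment to the XML data type.
  4. Use OUTER APPLY to apply the table-valued function nodes(), which splits the XML fragment into its constituent XML tags. This returns each XML tag on a separate row.
  5. Use the value() function to extract the value from each XML tag and return it with the specified data type.
  6. Append a comma to that value.
  7. Note that the values are returned on separate rows, so the DISTINCT keyword now removes the duplicate rows (i.e. values).
  8. Use the FOR XML PATH('') clause to concatenate the values from multiple rows into a single row.

Query

Putting the above approach into query form: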

    SELECT DISTINCT PivotedTable.PivotedColumn.value('.','nvarchar(max)') + ',' 
    FROM ( 
            -- This query returns the following in the DataXml column: 
            -- <tag>test1</tag><tag>test2</tag><tag>test1</tag><tag>test2</tag><tag>test3</tag><tag>test4</tag><tag>test4</tag><tag>test4</tag>
            -- i.e. it has turned the original delimited data into an XML fragment 
            SELECT 
              DataTable.DataColumn AS DataRaw 
            , CAST( 
                '<tag>' 
                -- First replace commas with pipes to have only a single delimiter 
                -- Then replace the pipe delimiters with a closing and opening tag 
                + replace(replace(DataTable.DataColumn, ',','|'), '|','</tag><tag>') 
                -- Add a final set of closing tags 
                + '</tag>' 
                AS XML) AS DataXml 
            FROM ( SELECT 'test1,test2,test1|test2,test3|test4,test4|test4' AS DataColumn) AS DataTable 
        ) AS x 
    OUTER APPLY DataXml.nodes('tag') AS PivotedTable(PivotedColumn) 
    -- Running the query without the following line will return the data in separate rows 
    -- Running the query with the following line returns the rows concatenated, i.e. it returns: 
    -- test1,test2,test3,test4, 
    FOR XML PATH('') 
    

Input & Result

Given the input:

      

    test1,test2,test1|test2,test3|test4,test4|test4

The above query returns the result:

      

    test1,test2,test3,test4,

Note the trailing comma. I will leave removing it as an exercise for you.
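One possible way to drop the trailing comma, offered as a sketch rather than as part of the original answer: prepend the comma instead of appending it, then strip the first character with STUFF(). The variable @Input and the aliases x, n, c, v, Val and DedupedList are illustrative names, and the ORDER BY is an added assumption to make the output order explicit.

```sql
-- Sketch only: @Input and all aliases below are illustrative names.
DECLARE @Input nvarchar(max) = N'test1,test2,test1|test2,test3|test4,test4|test4';

SELECT STUFF(
    (
        -- Put the comma in front of each value so the extra one is at position 1
        SELECT ',' + v.Val
        FROM (
            -- Same split as above: unify delimiters, build an XML fragment, shred it
            SELECT DISTINCT n.c.value('.', 'nvarchar(max)') AS Val
            FROM (
                SELECT CAST('<tag>'
                    + REPLACE(REPLACE(@Input, ',', '|'), '|', '</tag><tag>')
                    + '</tag>' AS XML) AS DataXml
            ) AS x
            CROSS APPLY x.DataXml.nodes('tag') AS n(c)
        ) AS v
        ORDER BY v.Val
        -- TYPE plus value() returns plain text instead of XML-escaped text
        FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'),
    1, 1, '') AS DedupedList;  -- remove the single leading comma
```

For the sample input this should return test1,test2,test3,test4 with no leading or trailing comma.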

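Edit: Count of duplicates

The OP asked in a comment: "How do I get the number of duplicates? In a separate column."

The simplest way is to take the query above but remove the final line, FOR XML PATH(''). Then count all values and the distinct values returned by the SELECT expression of that query (i.e. PivotedTable.PivotedColumn.value('.','nvarchar(max)')). The difference between the count of all values and the count of distinct values is the number of duplicate values.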

    SELECT 
        COUNT(PivotedTable.PivotedColumn.value('.','nvarchar(max)'))            AS CountOfAllValues 
      , COUNT(DISTINCT PivotedTable.PivotedColumn.value('.','nvarchar(max)'))   AS CountOfUniqueValues 
        -- The difference of the previous two counts is the number of duplicate values 
      , COUNT(PivotedTable.PivotedColumn.value('.','nvarchar(max)')) 
        - COUNT(DISTINCT PivotedTable.PivotedColumn.value('.','nvarchar(max)')) AS CountOfDuplicateValues 
    FROM ( 
            -- This query returns the following in the DataXml column: 
            -- <tag>test1</tag><tag>test2</tag><tag>test1</tag><tag>test2</tag><tag>test3</tag><tag>test4</tag><tag>test4</tag><tag>test4</tag>
            -- i.e. it has turned the original delimited data into an XML fragment 
            SELECT 
              DataTable.DataColumn AS DataRaw 
            , CAST( 
                '<tag>' 
                -- First replace commas with pipes to have only a single delimiter 
                -- Then replace the pipe delimiters with a closing and opening tag 
                + replace(replace(DataTable.DataColumn, ',','|'), '|','</tag><tag>') 
                -- Add a final set of closing tags 
                + '</tag>' 
                AS XML) AS DataXml 
            FROM ( SELECT 'test1,test2,test1|test2,test3|test4,test4|test4' AS DataColumn) AS DataTable 
        ) AS x 
    OUTER APPLY DataXml.nodes('tag') AS PivotedTable(PivotedColumn) 
    

For the same input shown above, this query outputs:

    CountOfAllValues CountOfUniqueValues CountOfDuplicateValues
    ---------------- ------------------- ----------------------
    8                4                   4
    

Answer 1 (score: 3)

The solution to your problem is as follows:

DECLARE @Data_String AS VARCHAR(1000), @Result AS VARCHAR(1000) = ''
SET @Data_String = 'test1,test2,test1|test2,test3|test4,test4|test4'
SET @Data_String = REPLACE(@Data_String, '|', ',')
-- Split via XML, keep the distinct values, and concatenate them into @Result
SELECT @Result = @Result + col + ','
FROM (
    SELECT DISTINCT t.c.value('.', 'varchar(100)') AS col
    FROM (SELECT CAST('<A>' + REPLACE(@Data_String, ',', '</A><A>') + '</A>' AS XML) AS col1) AS data
    CROSS APPLY col1.nodes('/A') AS t(c)
) AS Data
-- Strip the trailing comma
SELECT LEFT(@Result, LEN(@Result) - 1)

Result

test1,test2,test3,test4

Answer 2 (score: 0)

    DECLARE @string AS VARCHAR(1000) 
    SET @string = 'test1,test2,test1|test2,test3|test4,test4|test4'
    SET @string = REPLACE(@string,'|',',')
    DECLARE @t TABLE (val VARCHAR(MAX)) 

    DECLARE @xml XML
    SET @xml = N'<root><r>' + REPLACE(@string, ',', '</r><r>') + '</r></root>'
    INSERT INTO @t(val) SELECT r.value('.','VARCHAR(MAX)') AS Item FROM @xml.nodes('//root/r') AS RECORDS(r)
    -- Number the rows within each group of equal values, then delete all but the first
    ;WITH cte
    AS (SELECT ROW_NUMBER() OVER (PARTITION BY val ORDER BY val DESC) RN
    FROM  @t)
    DELETE FROM cte
    WHERE  RN > 1

    -- @t now holds one row per distinct value
    SELECT val FROM @t

Answer 3 (score: 0)

Try the following SQL script:

declare @List nvarchar(max)='test1,test2,test1|test2,test3|test4,test4|test4';
declare @Delimiter CHAR(1) =','
declare @XML AS XML
declare @result varchar(max)
set @List=Replace(@List,'|',',')
--Select @List

SET @XML = CAST(('<X>'+REPLACE(@List,@Delimiter ,'</X><X>')+'</X>') AS XML)
DECLARE @temp TABLE (Data nvarchar(100))
INSERT INTO @temp
SELECT N.value('.', 'nvarchar(100)') AS Data FROM @XML.nodes('X') AS T(N)
--SELECT distinct * FROM @temp

IF OBJECT_ID('tempdb..#temp') IS NOT NULL DROP TABLE #temp
Select distinct Data into #temp from @temp

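SET @result = ''
select @result = @result + Data + ', ' from #temp
-- LEN ignores the trailing space and a start position of 0 consumes one more character,
-- so this drops the final ', '
select SUBSTRING(@result, 0, LEN(@result))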

Answer 4 (score: 0)

I just tried the following script and it works perfectly:

declare @List VARCHAR(MAX)='test1,test2,test1|test2,test3|test4,test4|test4'
declare @Delim CHAR=','
DECLARE @ParsedList TABLE
(
Item VARCHAR(MAX)
)
DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
set @List=Replace(@List,'|',',')
SET @list = LTRIM(RTRIM(@list)) + @Delim
SET @pos = CHARINDEX(@delim, @list, 1)
WHILE @pos > 0
BEGIN
SET @list1 = LTRIM(RTRIM(LEFT(@list, @pos - 1)))
IF @list1 <> ''
INSERT INTO @ParsedList VALUES (CAST(@list1 AS VARCHAR(MAX)))
SET @list = SUBSTRING(@list, @pos+1, LEN(@list))
SET @pos = CHARINDEX(@delim, @list, 1)
END
SELECT @rlist = COALESCE(@rlist+',','') + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
Select @rlist