SQL query to split comma-separated values into rows

Date: 2017-08-26 00:31:03

Tags: sql sql-server tsql sql-server-2008-r2

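I have a table with a composite key and a column of comma-separated values. I need to split each row into one row per comma-separated element. I have seen similar questions with similar answers, but I have not been able to turn them into a solution of my own.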

I am running SQL Server 2008 R2.

| Key Part 1 | Key Part 2 | Key Part 3 | Values        |
|------------------------------------------------------|
| A          | A          | A          | PDE,PPP,POR   |
| A          | A          | B          | PDE,XYZ       |
| A          | B          | A          | PDE,RRR       |
|------------------------------------------------------|

I need this as the output:

| Key Part 1 | Key Part 2 | Key Part 3 | Values        | Sequence   |
|-------------------------------------------------------------------|
| A          | A          | A          | PDE           | 0          |
| A          | A          | A          | PPP           | 1          | 
| A          | A          | A          | POR           | 2          |
| A          | A          | B          | PDE           | 0          |
| A          | A          | B          | XYZ           | 1          |
| A          | B          | A          | PDE           | 0          |
| A          | B          | A          | RRR           | 1          |
|-------------------------------------------------------------------|

Thanks,

Jeff

3 Answers:

Answer 0 (score: 4)

If you don't have, or don't want, a Split/Parse UDF, here is a simple inline approach.

Example

Select A.[Key Part 1]
      ,A.[Key Part 2]
      ,A.[Key Part 3]
      ,B.*
 From YourTable A
 Cross Apply (
                Select [Values]   = LTrim(RTrim(X2.i.value('(./text())[1]', 'varchar(max)')))
                      ,[Sequence] = Row_Number() over (Order By (Select null))-1
                From  (Select x = Cast('<x>' + replace(A.[Values],',','</x><x>')+'</x>' as xml)) X1 
                Cross Apply x.nodes('x') X2(i)
             ) B

Returns one row per comma-separated element, with [Sequence] starting at 0 for each source row.
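One caveat with the inline XML approach: the Cast to xml fails if a value contains characters that are special in XML, such as & or <. A possible workaround, shown here only as a sketch, is to XML-escape the text with For XML Path('') before wrapping it in <x> elements. The table variable @Demo and its sample row are hypothetical and exist only for this illustration.

```sql
-- Sketch only: same inline XML split, but the source text is XML-escaped first.
-- @Demo and its contents are hypothetical sample data, not part of the question.
Declare @Demo Table ([Key Part 1] char(1), [Values] varchar(50));
Insert Into @Demo Values ('A', 'R&D,P<Q,POR');

Select A.[Key Part 1]
      ,B.*
 From @Demo A
 Cross Apply (
                Select [Values]   = LTrim(RTrim(X2.i.value('(./text())[1]', 'varchar(max)')))
                      ,[Sequence] = Row_Number() over (Order By (Select null))-1
                       -- (Select ... As [*] For XML Path('')) turns & into &amp; and < into &lt;
                       -- before the commas are replaced with element boundaries
                From  (Select x = Cast('<x>' + replace((Select A.[Values] As [*] For XML Path('')),',','</x><x>')+'</x>' as xml)) X1
                Cross Apply x.nodes('x') X2(i)
             ) B
```

The escaped entities contain no commas, so replacing the commas after escaping should still split on the original delimiters, and value() converts the entities back to the original characters.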

  

EDIT - If you are open to a table-valued function

The query would look like this

Select A.[Key Part 1]
      ,A.[Key Part 2]
      ,A.[Key Part 3]
      ,[Values] = B.RetVal
      ,[Sequence] = B.RetSeq-1
 From YourTable A
 Cross Apply [dbo].[udf-Str-Parse-8K](A.[Values],',') B

The UDF, if you are interested

CREATE FUNCTION [dbo].[udf-Str-Parse-8K] (@String varchar(max),@Delimiter varchar(25))
Returns Table 
As
Return (  
    with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
           cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 a,cte1 b,cte1 c,cte1 d) A ),
           cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter)) = @Delimiter),
           cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter,@String,s.N),0)-S.N,8000) From cte3 S)

    Select RetSeq = Row_Number() over (Order By A.N)
          ,RetVal = LTrim(RTrim(Substring(@String, A.N, A.L)))
    From   cte4 A
);
--Original Source http://www.sqlservercentral.com/articles/Tally+Table/72993/
--Select * from [dbo].[udf-Str-Parse-8K]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse-8K]('John||Cappelletti||was||here','||')

Answer 1 (score: 0)

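If all of the CSV values are exactly 3 characters long (as they are in your test data), you can use a tally table very efficiently by generating exactly the number of rows you need, rather than one row per character (which is what a splitter normally does to find the delimiters), because you already know where the delimiters are.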

In this case I'm using a tally function, but you could use a fixed tally table just as well.

Code for the tfn_Tally function...

SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
GO
CREATE FUNCTION dbo.tfn_Tally
/* ============================================================================
07/20/2017 JL, Created. Capable of creating a sequence of rows 
                ranging from -10,000,000,000,000,000 to 10,000,000,000,000,000
============================================================================ */
(
    @NumOfRows BIGINT,
    @StartWith BIGINT 
)
RETURNS TABLE WITH SCHEMABINDING AS 
RETURN
    WITH 
        cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),   -- 10 rows
        cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),                             -- 100 rows
        cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),                             -- 10,000 rows
        cte_n4 (n) AS (SELECT 1 FROM cte_n3 a CROSS JOIN cte_n3 b),                             -- 100,000,000 rows
        cte_Tally (n) AS (
            SELECT TOP (@NumOfRows)
                (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1) + @StartWith
            FROM 
                cte_n4 a CROSS JOIN cte_n4 b                                                    -- 10,000,000,000,000,000 rows
            )
    SELECT 
        t.n
    FROM 
        cte_Tally t;
GO

How to use it in the solution...

-- create some test data...
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL 
DROP TABLE #TestData;

CREATE TABLE #TestData (
    KeyPart1 CHAR(1),
    KeyPart2 CHAR(1),
    KeyPart3 CHAR(1),
    [Values] varchar(50) 
    );

INSERT #TestData (KeyPart1, KeyPart2, KeyPart3, [Values]) VALUES 
    ('A', 'A', 'A', 'PDE,PPP,POR'),
    ('A', 'A', 'B', 'PDE,XYZ'),
    ('A', 'B', 'A', 'PDE,RRR,XXX,YYY,ZZZ,AAA,BBB,CCC');

--==========================================================

-- solution query...
SELECT 
    td.KeyPart1, 
    td.KeyPart2, 
    td.KeyPart3, 
    x.SplitValue,
    [Sequence] = t.n
FROM
    #TestData td
    CROSS APPLY dbo.tfn_Tally(LEN(td.[Values]) - LEN(REPLACE(td.[Values], ',', '')) + 1, 0) t
    CROSS APPLY ( VALUES (SUBSTRING(td.[Values], t.n * 4 + 1, 3)) ) x (SplitValue);

Results...

KeyPart1 KeyPart2 KeyPart3 SplitValue Sequence
-------- -------- -------- ---------- --------------------
A        A        A        PDE        0
A        A        A        PPP        1
A        A        A        POR        2
A        A        B        PDE        0
A        A        B        XYZ        1
A        B        A        PDE        0
A        B        A        RRR        1
A        B        A        XXX        2
A        B        A        YYY        3
A        B        A        ZZZ        4
A        B        A        AAA        5
A        B        A        BBB        6
A        B        A        CCC        7
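Before relying on the fixed-width shortcut, it may be worth checking that every row really is a clean run of 3-character elements. The query below is a sketch of one way to do that, not part of the original solution: it builds a LIKE pattern of the form [^,][^,][^,], repeated once per expected element and lists any row that does not match. It assumes the #TestData table created above.

```sql
-- Sketch only: list rows whose [Values] is NOT made up entirely of
-- 3-character elements separated by single commas.
-- Rows where [Values] is NULL are not reported by this check.
SELECT 
    td.KeyPart1, 
    td.KeyPart2, 
    td.KeyPart3, 
    td.[Values]
FROM
    #TestData td
WHERE
    -- appending a trailing comma lets every element, including the last, match '[^,][^,][^,],'
    td.[Values] + ',' NOT LIKE REPLICATE('[^,][^,][^,],', (LEN(td.[Values]) + 1) / 4);
```

For the test data above this should return no rows. Any row it does return would be split incorrectly by the t.n * 4 + 1 offset arithmetic.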

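If the assumption that all CSV elements have the same number of characters is not correct, you are better off using a traditional tally-based splitter. In that case, my recommendation is DelimitedSplit8K, written by Jeff Moden.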

In that case, the solution query would look like this...

SELECT 
    td.KeyPart1, 
    td.KeyPart2, 
    td.KeyPart3, 
    SplitValue = dsk.Item,
    [Sequence] = dsk.ItemNumber - 1
FROM
    #TestData td
    CROSS APPLY dbo.DelimitedSplit8K(td.[Values], ',') dsk;

And the results...

KeyPart1 KeyPart2 KeyPart3 SplitValue Sequence
-------- -------- -------- ---------- --------------------
A        A        A        PDE        0
A        A        A        PPP        1
A        A        A        POR        2
A        A        B        PDE        0
A        A        B        XYZ        1
A        B        A        PDE        0
A        B        A        RRR        1
A        B        A        XXX        2
A        B        A        YYY        3
A        B        A        ZZZ        4
A        B        A        AAA        5
A        B        A        BBB        6
A        B        A        CCC        7

HTH, Jason

Answer 2 (score: 0)

-- Create the table

Create table YourTable 
(
p1 varchar(50),
p2 varchar(50),
p3 varchar(50),
pval varchar(50)
)
go

-- Insert the data

insert into YourTable values 
('A','A','A','PDE,PPP,POR'),
('A','A','B','PDE,XYZ'),
('A','B','A','PDE,RRR')
go

-- View the sample data

SELECT p1, p2, p3 , pval FROM YourTable
go

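-- Required result

SELECT p1, p2, p3,
       LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) AS Value1,
       -- ID is constant within each partition, so this ORDER BY does not by itself guarantee element order
       ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID ASC) - 1 AS SequenceNo
FROM (
       -- Number the source rows and turn each CSV string into <M>..</M> XML elements
       SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ID,
              p1, p2, p3, pval,
              CAST('<M>' + REPLACE(pval, ',', '</M><M>') + '</M>' AS XML) AS Data
       FROM YourTable
     ) AS A
CROSS APPLY Data.nodes('/M') AS Split(a)
go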

-- Drop the table created for this test

drop table YourTable
go