Split the text in a column by a delimiter (|), then stack all the split values into a single column without duplicates

Asked: 2015-05-26 05:34:15

Tags: sql sql-server delimiter

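I have a data report that is imported into SQL Server, and it bundles job cards together. I am building a report on this SQL table that looks at an Excel report and checks which job cards are missing. So far I have written up a manual method for fixing the data in the SQL table: use Text to Columns to unbundle the job cards, then stack the columns into one giant column. Is there a way to automate this in SQL Server?
Example: [each line under Column 1 is one row]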

Column 1
A437|Bb7772|d763ch
D444r7|Z71|
A37|Bc7772|766ch

It needs to look like this:

Column 1
A437
Bb7772
d763ch
D444r7
Z71
A37
Bc7772
766ch

Once the new column is created, I will also remove any duplicates.

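Sorry for the question, but honestly I don't even know where to start with splitting a column in SQL.
I think I could use UNION ALL to stack the values into a new column.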

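Oh, and to complicate things further, the number of job cards bundled together varies (it could be just two, it could be as many as 6, or it could be a single job card).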

I'm backed into a corner here, or I wouldn't even bother. And yes, my company's method of organizing job cards is terrible.

5 Answers:

Answer 0: (score: 1)

From my DBA post on the same subject:

Leverage Jeff Moden's Tally-Ho! CSV splitter from here:

CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE!  IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
     -- enough to cover VARCHAR(8000)
WITH E1(N) AS (
           SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
           SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
           SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
       ),                          --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                     -- for both a performance gain and prevention of accidental "overruns"
            SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() 
                                                        OVER (ORDER BY (SELECT NULL)) FROM E4
        ),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just
                     -- once for each delimiter)
            SELECT 1 UNION ALL
            SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
        ),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
            SELECT s.N1,
                   ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
            FROM cteStart s
        )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final
     -- element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = SUBSTRING(@pString, l.N1, l.L1)
   FROM cteLen l
;
go

We can then write the solution as an application of Jeff's function followed by a pivot, like this:

with data as (
    select Code,Location,Quantity,Store from ( values
        ('L698-W-EA',          NULL,                                      2, 'A')
       ,('L82009-EA',          'A1K2, A1N2, C4Y3, CBP2',                  2, 'A')
       ,('L80401-A-EA',        'A1S2, SHIP, R2F1, CBP5, BRP, BRP1-20',    17,'A')
       ,('CWD2132W-BOX-25PK',  'A-AISLE',                                 1, 'M')
       ,('GM22660003-EA',      'B1K2',                                    1, 'M')
    )data(Code,Location,Quantity,Store)
)
,shredded as (
    select Code,Location,Quantity,Store,t.*
    from data
    cross apply [dbo].[DelimitedSplit8K](data.Location,',') as t
)
select 
    pvt.Code,pvt.Quantity,pvt.Store
   ,cast(isnull(pvt.[1],' ') as varchar(8)) as Loc1
   ,cast(isnull(pvt.[2],' ') as varchar(8)) as Loc2
   ,cast(isnull(pvt.[3],' ') as varchar(8)) as Loc3
   ,cast(isnull(pvt.[4],' ') as varchar(8)) as Loc4
   ,cast(isnull(pvt.[5],' ') as varchar(8)) as Loc5 
   ,cast(isnull(pvt.[6],' ') as varchar(8)) as Loc6
from shredded
pivot (max(Item) for ItemNumber in ([1],[2],[3],[4],[5],[6])) pvt;
go

This yields:

Code              Quantity    Store Loc1     Loc2     Loc3     Loc4     Loc5     Loc6
----------------- ----------- ----- -------- -------- -------- -------- -------- --------
L698-W-EA         2           A                                                   
L82009-EA         2           A     A1K2      A1N2     C4Y3     CBP2              
L80401-A-EA       17          A     A1S2      SHIP     R2F1     CBP5     BRP      BRP1-20
CWD2132W-BOX-25PK 1           M     A-AISLE                                       
GM22660003-EA     1           M     B1K2                                          
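The pivot above spreads the pieces across columns. For the single stacked column the question asks for, the same splitter can be applied row by row and the result de-duplicated. A minimal sketch, assuming the data sits in a hypothetical table MyTable whose Column1 values fit in 8000 characters:

```sql
-- MyTable and Column1 are assumed names; adjust them to the real schema.
SELECT DISTINCT s.Item
FROM MyTable AS t
CROSS APPLY dbo.DelimitedSplit8K(t.Column1, '|') AS s
WHERE s.Item <> '';   -- drop the empty piece produced by a trailing '|'
```

CROSS APPLY calls the function once per row, so the number of job cards bundled in each row does not need to be known in advance.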

Answer 1: (score: 1)

Try this

Function

 CREATE  FUNCTION [dbo].[fn_Split](@text varchar(8000), @delimiter varchar(20))
    RETURNS @Strings TABLE
    (   
      position int IDENTITY PRIMARY KEY,
      value varchar(8000)  
    )
    AS
    BEGIN

    DECLARE @index int
    SET @index = -1

    WHILE (LEN(@text) > 0)
      BEGIN 
        SET @index = CHARINDEX(@delimiter , @text) 
        IF (@index = 0) AND (LEN(@text) > 0) 
          BEGIN  
            INSERT INTO @Strings VALUES (@text)
              BREAK 
          END 
        IF (@index > 1) 
          BEGIN  
            INSERT INTO @Strings VALUES (LEFT(@text, @index - 1))  
            SET @text = RIGHT(@text, (LEN(@text) - @index)) 
          END 
        ELSE
          SET @text = RIGHT(@text, (LEN(@text) - @index))
        END
      RETURN
    END

Query

select value from fn_split( (select stuff(( select '|'+Column1 from table1 for xml path('')),1,1,'')) ,'|')
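The query above concatenates every row into one string before splitting, and because fn_Split takes a varchar(8000) parameter, anything beyond 8000 characters would be cut off. It also returns duplicates as written. One possible variation, sketched here with the same table1 and Column1 names as the query above, splits each row separately and de-duplicates:

```sql
-- Split each row on its own so no single string has to hold the whole table.
SELECT DISTINCT s.value
FROM table1 AS t
CROSS APPLY dbo.fn_Split(t.Column1, '|') AS s;
```

The function's loop already skips empty pieces, so a trailing '|' should not produce a blank row here.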

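Answer 2: (score: 1)

There are many string split functions for SQL Server. Most of them perform well when you have a short list of small strings. You can read this article for performance tests comparing some of the main solutions.

For this example I will use the Jeff Moden splitter function from that article, but you should choose whichever one best fits your needs.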

--  Create the sample data
CREATE TABLE MyTable (Column1 varchar(max))
INSERT INTO MyTable VALUES 
('A437|Bb7772|d763ch'),
('D444r7|Z71|'),
('A37|Bc7772|766ch')

GO

-- Create the split function (CREATE FUNCTION must be the first statement in its batch)
CREATE FUNCTION dbo.SplitStrings
(
   @List NVARCHAR(MAX),
   @Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
  WITH E1(N)        AS ( SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 
                         UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 
                         UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
       E2(N)        AS (SELECT 1 FROM E1 a, E1 b),
       E4(N)        AS (SELECT 1 FROM E2 a, E2 b),
       E42(N)       AS (SELECT 1 FROM E4 a, E2 b),
       cteTally(N)  AS (SELECT 0 UNION ALL SELECT TOP (DATALENGTH(ISNULL(@List,1))) 
                         ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E42),
       cteStart(N1) AS (SELECT t.N+1 FROM cteTally t
                         WHERE (SUBSTRING(@List,t.N,1) = @Delimiter OR t.N = 0))
  SELECT Item = SUBSTRING(@List, s.N1, ISNULL(NULLIF(CHARINDEX(@Delimiter,@List,s.N1),0)-s.N1,8000))
    FROM cteStart s;

Now, for the actual solution:

DECLARE @AllValues varchar(max)

-- Concatenate all the values in Column1 to a single string. 
-- the replace function is to prevent a double delimiter in case of the value of any row begins or ends with the delimiter
SELECT @AllValues = REPLACE(STUFF((
   SELECT '|'+ Column1
   FROM MyTable 
   FOR XML PATH('')
 ), 1, 1, ''), '||', '|')

-- These are the distinct values:
SELECT DISTINCT Item
FROM dbo.SplitStrings(@AllValues, '|')
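One caveat with FOR XML PATH('') concatenation is that characters such as & and < come back as XML entities (&amp;, &lt;). If the job card values could ever contain them, a commonly used variation adds TYPE and reads the text back with .value(). A standalone sketch, using the same MyTable and SplitStrings names as above:

```sql
DECLARE @AllValues varchar(max);

-- TYPE returns the result as xml; .value('.') then decodes any XML entities.
SELECT @AllValues = STUFF((
    SELECT '|' + Column1
    FROM MyTable
    FOR XML PATH(''), TYPE
).value('.', 'varchar(max)'), 1, 1, '');

-- Filtering empty items here replaces the REPLACE('||', '|') clean-up.
SELECT DISTINCT Item
FROM dbo.SplitStrings(@AllValues, '|')
WHERE Item <> '';
```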

Now, assuming this table has only the one column, you can do this:

-- get the values in the column
SELECT @AllValues = REPLACE(STUFF((
   SELECT '|'+ Column1
   FROM MyTable 
   FOR XML PATH('')
 ), 1, 1, ''), '||', '|')

-- delete all rows from the table
TRUNCATE TABLE MyTable 

-- insert new values
INSERT INTO MyTable
SELECT DISTINCT Item
FROM dbo.SplitStrings(@AllValues, '|')

Read here to find out why I chose to truncate the table rather than delete from it.
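This thread predates it, but on SQL Server 2016 or later (database compatibility level 130 or higher) the built-in STRING_SPLIT function can stand in for a hand-written splitter. A minimal sketch, assuming the same MyTable and Column1 as above:

```sql
-- STRING_SPLIT returns one row per piece in a column named value.
SELECT DISTINCT s.value
FROM MyTable AS t
CROSS APPLY STRING_SPLIT(t.Column1, '|') AS s
WHERE s.value <> '';   -- ignore the empty piece after a trailing '|'
```

This is a swapped-in technique rather than what the answer describes, so it only applies if the server version allows it.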

Answer 3: (score: 1)

DECLARE @t table(id int identity(1,1), name varchar(100))
INSERT @t VALUES
('A437|Bb7772|d763ch'),
('D444r7|Z71'),
('A37|Bc7772|766ch')

;WITH Value AS
(
     SELECT row_number() over(order by id) rn,t.c.value('.', 'VARCHAR(2000)') name
     FROM (
         SELECT id, x = CAST('<t>' + 
               REPLACE(name, '|', '</t><t>') + '</t>' AS XML)
         FROM @t
     ) a
     CROSS APPLY x.nodes('/t') t(c)
)
SELECT DISTINCT name
FROM Value 
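The CAST to XML above fails if a value contains a character that is special in XML, such as & or <. If that is a possibility, the values can be escaped before the delimiter is turned into tags. A sketch under that assumption, with made-up sample rows:

```sql
DECLARE @t table (id int identity(1,1), name varchar(100));
INSERT @t VALUES ('A437|Bb7772|d763ch'), ('D444r7|Z71|'), ('A&B|A437');  -- sample data only

SELECT DISTINCT n.c.value('.', 'varchar(2000)') AS name
FROM @t AS src
CROSS APPLY (
    -- Escape &, < and > first, then convert each '|' into a tag boundary.
    SELECT CAST('<t>' +
           REPLACE(REPLACE(REPLACE(REPLACE(src.name,
               '&', '&amp;'), '<', '&lt;'), '>', '&gt;'), '|', '</t><t>')
           + '</t>' AS xml) AS x
) AS a
CROSS APPLY a.x.nodes('/t') AS n(c)
WHERE n.c.value('.', 'varchar(2000)') <> '';   -- skip empty pieces
```

The ampersand has to be replaced first, otherwise the entities produced by the other replacements would be escaped a second time.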

Answer 4: (score: 0)

If your Column1 always looks like '%|%|%' (that is, it always contains two delimiters), use this query:

SELECT part 
FROM (
    SELECT LEFT(column1, CHARINDEX('|', column1, 0) - 1) part
    FROM t
    UNION 
    SELECT SUBSTRING(column1, CHARINDEX('|', column1, 0) + 1, CHARINDEX('|', column1, CHARINDEX('|', column1, 0) + 1) - CHARINDEX('|', column1, 0) - 1)
    FROM t
    UNION 
    SELECT RIGHT(column1, CHARINDEX('|', REVERSE(column1), 0) - 1)
    FROM t) parts
WHERE 
    part <> ''
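The query above relies on every row containing two delimiters, while the question says a bundle may hold anywhere from one to six job cards. One way to lift that restriction without a helper function is a recursive CTE that peels off one piece per step. A sketch with a table variable standing in for t (the sample rows and the varchar(100) size are assumptions):

```sql
DECLARE @t table (column1 varchar(100));
INSERT @t VALUES ('A437|Bb7772|d763ch'), ('D444r7|Z71|'), ('A37|Bc7772|766ch'), ('X1');

WITH parts AS (
    -- Anchor: take the first piece; appending '|' guarantees a delimiter is found.
    SELECT CAST(LEFT(column1, CHARINDEX('|', column1 + '|') - 1) AS varchar(100)) AS part,
           CAST(SUBSTRING(column1 + '|', CHARINDEX('|', column1 + '|') + 1, 100) AS varchar(100)) AS rest
    FROM @t
    UNION ALL
    -- Recursive step: take the next piece from whatever is left.
    SELECT CAST(LEFT(rest, CHARINDEX('|', rest) - 1) AS varchar(100)),
           CAST(SUBSTRING(rest, CHARINDEX('|', rest) + 1, 100) AS varchar(100))
    FROM parts
    WHERE rest <> ''
)
SELECT DISTINCT part
FROM parts
WHERE part <> '';
```

The default recursion limit of 100 is far above the six cards per row mentioned in the question; for long lists a tally-based splitter is likely to perform better.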