Complex SQL string parsing

Asked: 2013-01-10 15:11:09

Tags: sql-server string tsql

I have the following text field in a SQL Server table:

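1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0

  1. I want to retrieve only the part before the exclamation mark (!). So for 1!1 I need only 1, for 3!0 I need only 3, and for 23!0 I need only 23.

  2. I also want to retrieve the part after the exclamation mark (!). So for 1!1 I need only 1, for 3!0 I need only 0, and for 23!0 I need only 0.

  3. The values from points 1 and 2 should each be inserted into a separate column of a SQL Server table.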

3 answers:

Answer 0: (score: 1)

I like SQL Server's XML functionality. It is a good way to parse data like this. Try this:

--Load the original string
DECLARE @string nvarchar(max) = '1!2,3!4,5!6,7!8,9!10';

--Turn it into XML
SET @string = REPLACE(@string,',','</SecondNumber></Pair><Pair><FirstNumber>') + '</SecondNumber></Pair>';
SET @string = '<Pair><FirstNumber>' + REPLACE(@string,'!','</FirstNumber><SecondNumber>');

--Show the new version of the string
SELECT @string AS XmlIfiedString;

--Load it into an XML variable
DECLARE @xml XML = @string;

--Now, First and Second Number from each pair...
SELECT
  Pairs.Pair.value('FirstNumber[1]','nvarchar(1024)') AS FirstNumber,
  Pairs.Pair.value('SecondNumber[1]','nvarchar(1024)') AS SecondNumber
FROM @xml.nodes('//*:Pair') Pairs(Pair);

The query above converts the string to XML that looks like this:

<Pair><FirstNumber>1</FirstNumber><SecondNumber>2</SecondNumber></Pair> ...

It then parses that XML to return results like this:

FirstNumber | SecondNumber
----------- | ------------
          1 |            2
          3 |            4
          5 |            6
          7 |            8
          9 |           10
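
Point 3 of the question asks for the two parts to end up in separate columns of a table, which the query above stops short of. A minimal sketch of that last step follows, assuming both parts are always integers; the table variable @Pairs and its column types are hypothetical stand-ins for the real target table:

```sql
-- Same sample string as above
DECLARE @string nvarchar(max) = N'1!2,3!4,5!6,7!8,9!10';

-- Build the XML fragment in one expression:
-- '!' separates FirstNumber from SecondNumber, ',' separates pairs
DECLARE @xml xml = CAST(
      N'<Pair><FirstNumber>'
    + REPLACE(REPLACE(@string, N'!', N'</FirstNumber><SecondNumber>'),
              N',', N'</SecondNumber></Pair><Pair><FirstNumber>')
    + N'</SecondNumber></Pair>' AS xml);

-- Hypothetical target; replace with the real table and column names
DECLARE @Pairs TABLE (FirstNumber int NOT NULL, SecondNumber int NOT NULL);

-- One row per <Pair> element, each part in its own column
INSERT INTO @Pairs (FirstNumber, SecondNumber)
SELECT Pairs.Pair.value('(FirstNumber/text())[1]',  'int'),
       Pairs.Pair.value('(SecondNumber/text())[1]', 'int')
FROM @xml.nodes('/Pair') AS Pairs(Pair);

SELECT FirstNumber, SecondNumber FROM @Pairs;
```

If the parts can contain non-numeric text, keeping the nvarchar(1024) type from the original query avoids conversion errors.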

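Answer 1: (score: 0)

I completely agree with those who complain about data like this. But the fact is that we often have no control over the format of our sources.

Here is my approach...

First you need a tokeniser. This one is very efficient (possibly the fastest non-CLR splitter). It can be found at http://www.sqlservercentral.com/articles/Tally+Table/72993/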

CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE!  IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
     -- enough to cover VARCHAR(8000)
  WITH E1(N) AS (
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                ),                          --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
 cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                     -- for both a performance gain and prevention of accidental "overruns"
                 SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                ),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                 SELECT 1 UNION ALL
                 SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                ),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                 SELECT s.N1,
                        ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                   FROM cteStart s
                )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = SUBSTRING(@pString, l.N1, l.L1)
   FROM cteLen l
;
GO

Then you consume it like this...

DECLARE @Wtf VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0'

SELECT   LEFT(Item, CHARINDEX('!', Item)-1)
        ,RIGHT(Item, CHARINDEX('!', REVERSE(Item))-1)
FROM [dbo].[DelimitedSplit8K](@Wtf, ',')

Of course, the posted function and the parsing logic can be combined into a single function.
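
One way that combination might look is sketched below. It assumes dbo.DelimitedSplit8K has already been created as shown above; the function name dbo.SplitPairs and the output column names are hypothetical:

```sql
-- Hypothetical wrapper: splits on ',' and then on '!' in one inline function
CREATE FUNCTION dbo.SplitPairs (@pString VARCHAR(8000))
RETURNS TABLE AS
RETURN
    SELECT s.ItemNumber,
           -- NULLIF turns "no '!' found" (0) into NULL, so a malformed item
           -- yields NULL instead of an invalid-length error
           FirstPart  = LEFT(s.Item, NULLIF(CHARINDEX('!', s.Item), 0) - 1),
           SecondPart = SUBSTRING(s.Item, NULLIF(CHARINDEX('!', s.Item), 0) + 1, 8000)
    FROM dbo.DelimitedSplit8K(@pString, ',') AS s;
GO

-- Usage: one row per pair, each part in its own column
SELECT ItemNumber, FirstPart, SecondPart
FROM dbo.SplitPairs('1!1,3!0,23!0,288!0');
```

Because the wrapper is an inline table-valued function, it can feed an INSERT ... SELECT directly, which covers the third point of the question.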

Answer 2: (score: 0)

I agree that normalizing the data is the best approach. However, here is an XML solution for parsing the data:

DECLARE @str VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0'
    ,@xml XML

SET @xml  = CAST('<row><col>' + REPLACE(REPLACE(@str,'!','</col><col>'),',','</col></row><row><col>') + '</col></row>' AS XML)

SELECT  
     line.col.value('col[1]', 'varchar(1000)') AS col1
    ,line.col.value('col[2]', 'varchar(1000)') AS col2
FROM    @xml.nodes('/row') AS line(col)