I have the following text field in a SQL Server table:
1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0
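1. I want to retrieve only the part before the exclamation mark (!). So for 1!1 I need only 1, for 3!0 only 3, and for 23!0 only 23.
2. I also want to retrieve the part after the exclamation mark (!). So for 1!1 I need only 1, for 3!0 only 0, and for 23!0 only 0.
The results of points 1 and 2 should each be inserted into a separate column of a SQL Server table.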
Answer 0 (score: 1)
I like SQL Server's XML capabilities. They are a good way to parse this kind of data. Try this:
--Load the original string
DECLARE @string nvarchar(max) = '1!2,3!4,5!6,7!8,9!10';
--Turn it into XML
SET @string = REPLACE(@string,',','</SecondNumber></Pair><Pair><FirstNumber>') + '</SecondNumber></Pair>';
SET @string = '<Pair><FirstNumber>' + REPLACE(@string,'!','</FirstNumber><SecondNumber>');
--Show the new version of the string
SELECT @string AS XmlIfiedString;
--Load it into an XML variable
DECLARE @xml XML = @string;
--Now, First and Second Number from each pair...
SELECT
Pairs.Pair.value('FirstNumber[1]','nvarchar(1024)') AS FirstNumber,
Pairs.Pair.value('SecondNumber[1]','nvarchar(1024)') AS SecondNumber
FROM @xml.nodes('//*:Pair') Pairs(Pair);
The query above converts the string into XML like this:
<Pair><FirstNumber>1</FirstNumber><SecondNumber>2</SecondNumber></Pair> ...
It then parses that XML and returns the following result:
FirstNumber | SecondNumber
----------- | ------------
1 | 2
3 | 4
5 | 6
7 | 8
9 | 10
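The question also asks for the two parts to land in separate columns of a table, which the query above stops short of. A minimal sketch of that last step, assuming the values are always integers and contain no XML-special characters (&, <, >); the table variable @Pairs and its column names are hypothetical stand-ins for the real target table:

```sql
-- Hypothetical target table; replace with the real destination table.
DECLARE @Pairs TABLE (FirstNumber int, SecondNumber int);

DECLARE @input nvarchar(max) = N'1!2,3!4,5!6,7!8,9!10';

-- Same REPLACE-based conversion as above, done in one expression.
DECLARE @pairsXml xml =
    N'<Pair><FirstNumber>'
    + REPLACE(REPLACE(@input, N'!', N'</FirstNumber><SecondNumber>'),
              N',', N'</SecondNumber></Pair><Pair><FirstNumber>')
    + N'</SecondNumber></Pair>';

-- Shred the XML and insert one row per pair.
INSERT INTO @Pairs (FirstNumber, SecondNumber)
SELECT p.Pair.value('(FirstNumber/text())[1]', 'int'),
       p.Pair.value('(SecondNumber/text())[1]', 'int')
FROM @pairsXml.nodes('/Pair') AS p(Pair);

SELECT FirstNumber, SecondNumber FROM @Pairs;
```

If the values can be non-numeric, nvarchar columns and an nvarchar type in .value() would be the safer choice.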
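Answer 1 (score: 0)
I fully agree with those who complain about this kind of data. But the fact is that we usually have no control over the format of our sources.
Here is my approach...
First you need a tokeniser. This one is very efficient (possibly the fastest non-CLR one). It can be found at http://www.sqlservercentral.com/articles/Tally+Table/72993/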
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
GO
Then you consume it like this...
DECLARE @Wtf VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0'
SELECT LEFT(Item, CHARINDEX('!', Item)-1)
,RIGHT(Item, CHARINDEX('!', REVERSE(Item))-1)
FROM [dbo].[DelimitedSplit8K](@Wtf, ',')
Of course, the posted function and the parsing logic could be combined into a single function.
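One way that combination might look is a thin inline table-valued function wrapped around the splitter. This is only a sketch: the name dbo.SplitPairs and its output column names are hypothetical, and it assumes dbo.DelimitedSplit8K has already been created as shown above. NULLIF is used so that an item without a '!' yields NULLs instead of raising an invalid-length error.

```sql
-- Hypothetical wrapper; assumes dbo.DelimitedSplit8K exists as defined above.
CREATE FUNCTION dbo.SplitPairs (@pString VARCHAR(8000))
RETURNS TABLE AS
RETURN
SELECT s.ItemNumber,
       -- Part before the '!' (NULL when the item has no '!')
       FirstPart  = LEFT(s.Item, NULLIF(CHARINDEX('!', s.Item), 0) - 1),
       -- Part after the '!' (NULL when the item has no '!')
       SecondPart = SUBSTRING(s.Item, NULLIF(CHARINDEX('!', s.Item), 0) + 1, 8000)
FROM dbo.DelimitedSplit8K(@pString, ',') AS s;
GO

-- Usage: one row per pair, ordered by position in the original string.
SELECT ItemNumber, FirstPart, SecondPart
FROM dbo.SplitPairs('1!1,3!0,23!0')
ORDER BY ItemNumber;
```

Keeping the wrapper as an inline function (a single RETURN SELECT) lets the optimizer expand it into the calling query, which is the same reason the splitter itself is written that way.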
Answer 2 (score: 0)
I agree that normalizing the data is the best approach. However, here is an XML solution for parsing the data:
DECLARE @str VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0'
,@xml XML
SET @xml = CAST('<row><col>' + REPLACE(REPLACE(@str,'!','</col><col>'),',','</col></row><row><col>') + '</col></row>' AS XML)
SELECT
line.col.value('col[1]', 'varchar(1000)') AS col1
,line.col.value('col[2]', 'varchar(1000)') AS col2
FROM @xml.nodes('/row') AS line(col)
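Since the question says the text lives in a table column rather than a variable, the same conversion can be applied per row with CROSS APPLY. The following is a sketch under assumptions: the table variable @Source and its columns Id and PairList are hypothetical, and the data is assumed to contain no XML-special characters.

```sql
-- Hypothetical source table holding one delimited list per row.
DECLARE @Source TABLE (Id INT, PairList VARCHAR(1000));
INSERT INTO @Source (Id, PairList)
VALUES (1, '1!1,3!0,23!0'),
       (2, '7!2,140!0');

SELECT s.Id,
       line.col.value('col[1]', 'varchar(1000)') AS col1,
       line.col.value('col[2]', 'varchar(1000)') AS col2
FROM @Source AS s
-- Build the XML for each row's list, using the same REPLACE trick as above.
CROSS APPLY (
    SELECT CAST('<row><col>'
                + REPLACE(REPLACE(s.PairList, '!', '</col><col>'),
                          ',', '</col></row><row><col>')
                + '</col></row>' AS XML) AS x
) AS conv
-- Shred that XML into one output row per pair.
CROSS APPLY conv.x.nodes('/row') AS line(col);
```

Prefixing the SELECT with INSERT INTO a target table and column list would store the two parts in separate columns, as the question requests.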