T-SQL: finding a specific value in a CSV string

Asked: 2012-09-05 09:19:27

Tags: sql tsql parsing csv

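I need some help with a SQL query. I have a column whose values are stored as comma-separated values.

I need to write a query that finds the 3rd delimited item in each value in the column.

Can this be done in a SELECT statement? For example: ColumnValue: josh,Reg01,False,a0-t0,22/09/2010

So I need to get the 3rd value from the string above, i.e. False.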

4 answers:

Answer 0 (score: 6)

@s is your string...

select 
    SUBSTRING (@s,
    CHARINDEX(',',@s,CHARINDEX(',',@s)+1)+1,
    CHARINDEX(',',@s,CHARINDEX(',',@s,CHARINDEX(',',@s)+1)+1)
          -CHARINDEX(',',@s,CHARINDEX(',',@s)+1)-1)
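
The nested calls are easier to follow when each comma position gets a name. A minimal sketch using the sample value from the question; the variables @c1, @c2 and @c3 are introduced here purely for illustration:

```sql
DECLARE @s VARCHAR(100);
SET @s = 'josh,Reg01,False,a0-t0,22/09/2010';

DECLARE @c1 INT, @c2 INT, @c3 INT;
SET @c1 = CHARINDEX(',', @s);            -- position of the 1st comma
SET @c2 = CHARINDEX(',', @s, @c1 + 1);   -- position of the 2nd comma
SET @c3 = CHARINDEX(',', @s, @c2 + 1);   -- position of the 3rd comma

-- The 3rd item sits between the 2nd and 3rd commas
SELECT SUBSTRING(@s, @c2 + 1, @c3 - @c2 - 1) AS ThirdValue;
```

With the sample value this returns False. If the string has exactly three items (only two commas), the third CHARINDEX returns 0, the length goes negative and SUBSTRING raises an error; searching in @s + ',' instead of @s for the third comma avoids that case.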

Or, more generally...

;with cte as 
(
    select 1 as Item, 1 as Start, nullif(CHARINDEX(',',@s, 1),0) as Split -- nullif: a string with no comma yields NULL here instead of 0
    union all
    select cte.Item+1, cte.Split+1, nullif(CHARINDEX(',',@s, cte.Split+1),0) as Split
    from cte
    where cte.Split<>0  
)   
select SUBSTRING(@s, start,isnull(split,len(@s)+1)-start) 
from cte 
where Item = 3
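
The question asks about a column rather than a variable. One way to apply the same CHARINDEX idea per row is sketched below, assuming SQL Server 2005 or later (for CROSS APPLY) and a hypothetical table variable @t holding sample values; the aliases p1, p2 and p3 are illustrative only:

```sql
DECLARE @t TABLE (ColumnValue VARCHAR(50));
INSERT INTO @t (ColumnValue) SELECT 'josh,Reg01,False,a0-t0,22/09/2010';
INSERT INTO @t (ColumnValue) SELECT 'stackoverflow';

SELECT t.ColumnValue,
       SUBSTRING(t.ColumnValue, p2.pos + 1, p3.pos - p2.pos - 1) AS ThirdValue
FROM @t AS t
-- Each APPLY finds the next comma, starting just after the previous one
CROSS APPLY (SELECT CHARINDEX(',', t.ColumnValue) AS pos) AS p1
CROSS APPLY (SELECT CHARINDEX(',', t.ColumnValue, p1.pos + 1) AS pos) AS p2
-- The appended comma lets the 3rd item also be the last item in the string
CROSS APPLY (SELECT CHARINDEX(',', t.ColumnValue + ',', p2.pos + 1) AS pos) AS p3
-- Skip rows that have fewer than three items
WHERE p1.pos > 0 AND p2.pos > 0;
```

Only the first row qualifies here, with ThirdValue = False; the single-item row is filtered out.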

Now go and store your data properly :)

Answer 1 (score: 4)

Try this (assuming SQL Server 2005+)

DECLARE @t TABLE(ColumnValue VARCHAR(50))
INSERT INTO @t(ColumnValue) SELECT 'josh,Reg01,False,a0-t0,22/09/2010'
INSERT INTO @t(ColumnValue) SELECT 'mango,apple,bannana,grapes'
INSERT INTO @t(ColumnValue) SELECT 'stackoverflow'

SELECT ThirdValue = splitdata
FROM(
        SELECT 
            Rn = ROW_NUMBER() OVER(PARTITION BY ColumnValue ORDER BY (SELECT 1))
            ,X.ColumnValue
            ,Y.splitdata 
        FROM
         (
            SELECT *,
            CAST('<X>'+REPLACE(F.ColumnValue,',','</X><X>')+'</X>' AS XML) AS xmlfilter FROM @t F
         )X
         CROSS APPLY
         ( 
            SELECT fdata.D.value('.','varchar(50)') AS splitdata 
            FROM X.xmlfilter.nodes('X') as fdata(D)
         ) Y
    )X WHERE X.Rn = 3

Result:

ThirdValue

False
bannana

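It is also not clear from your question which version of SQL Server you are using. If you are on SQL Server 2000, you can use the following approach instead.

Step 1: Create a numbers table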

CREATE TABLE dbo.Numbers
(
   N INT NOT NULL PRIMARY KEY
);
GO

DECLARE @rows AS INT;
SET @rows = 1;

INSERT INTO dbo.Numbers VALUES(1);
WHILE(@rows <= 10000)
BEGIN
   INSERT INTO dbo.Numbers SELECT N + @rows FROM dbo.Numbers;
   SET @rows = @rows * 2;
END 
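
The loop doubles the table on each pass, so it ends with 16,384 rows (N from 1 through 16384), the first power of two above 10,000. A quick sanity check, assuming the table was created exactly as above:

```sql
-- Expect 16384 rows, numbered contiguously from 1 to 16384
SELECT COUNT(*) AS RowsInTable, MIN(N) AS MinN, MAX(N) AS MaxN
FROM dbo.Numbers;
```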

Step 2: Apply the following query

DECLARE @t TABLE(ColumnValue VARCHAR(50))
INSERT INTO @t(ColumnValue) SELECT 'josh,Reg01,False,a0-t0,22/09/2010'
INSERT INTO @t(ColumnValue) SELECT 'mango,apple,bannana,grapes'
INSERT INTO @t(ColumnValue) SELECT 'stackoverflow'

--Declare a table variable with an identity column to store the intermediate results
DECLARE @tempT TABLE(Id INT IDENTITY,ColumnValue VARCHAR(50),SplitData VARCHAR(50))

-- Insert the records into the table variable
INSERT INTO @tempT
SELECT  
    ColumnValue
    ,SUBSTRING(ColumnValue, Numbers.N,CHARINDEX(',', ColumnValue + ',', Numbers.N) - Numbers.N) AS splitdata 
FROM @t 
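JOIN Numbers ON Numbers.N <= DATALENGTH(ColumnValue) + 1  
AND SUBSTRING(',' + ColumnValue, Numbers.N, 1) = ','  
ORDER BY ColumnValue, Numbers.N -- assign identity values in item order so the Rn count below is deterministic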

--Project the filtered records

SELECT ThirdValue = X.splitdata
FROM
--The correlated subquery emulates ROW_NUMBER() OVER(PARTITION BY ColumnValue)
(SELECT 
  Rn = (SELECT COUNT(*) 
        FROM @tempT t2 
        WHERE t2.ColumnValue=t1.ColumnValue 
        AND t2.Id<=t1.Id)
 ,t1.ColumnValue
 ,t1.splitdata
FROM @tempT t1)X
WHERE X.Rn =3

Result:

ThirdValue

False
bannana
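
The join condition in step 2 is the core of the technique: with a comma prepended to the string, every position N that holds a comma marks the start of an item in the original string, and the next comma after N marks its end. A minimal sketch on a single string, assuming the dbo.Numbers table from step 1 exists; @s is an illustrative variable:

```sql
DECLARE @s VARCHAR(50);
SET @s = 'josh,Reg01,False';

SELECT n.N AS StartPos,
       -- @s + ',' guarantees a closing comma for the last item
       SUBSTRING(@s, n.N, CHARINDEX(',', @s + ',', n.N) - n.N) AS Item
FROM dbo.Numbers AS n
WHERE n.N <= DATALENGTH(@s) + 1
  AND SUBSTRING(',' + @s, n.N, 1) = ','   -- position N starts an item
ORDER BY n.N;
```

This lists josh, Reg01 and False with start positions 1, 6 and 12.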

You can also use master..spt_values as your numbers table

DECLARE @t TABLE(ColumnValue VARCHAR(50))
INSERT INTO @t(ColumnValue) SELECT 'josh,Reg01,False,a0-t0,22/09/2010'
INSERT INTO @t(ColumnValue) SELECT 'mango,apple,bannana,grapes'
INSERT INTO @t(ColumnValue) SELECT 'stackoverflow'

--Declare a table variable with an identity column to store the intermediate results
DECLARE @tempT TABLE(Id INT IDENTITY,ColumnValue VARCHAR(50),SplitData VARCHAR(50))

-- Insert the records into the table variable
INSERT INTO @tempT
SELECT  
    ColumnValue
    ,SUBSTRING(ColumnValue, Number ,CHARINDEX(',', ColumnValue + ',', Number ) - Number) AS splitdata 
FROM @t 
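JOIN master..spt_values ON Number <= DATALENGTH(ColumnValue) + 1  AND type='P'
AND SUBSTRING(',' + ColumnValue, Number , 1) = ','  
ORDER BY ColumnValue, Number -- assign identity values in item order so the Rn count below is deterministic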

--Project the filtered records
SELECT ThirdValue = X.splitdata
FROM
--The correlated subquery emulates ROW_NUMBER() OVER(PARTITION BY ColumnValue)
(SELECT 
  Rn = (SELECT COUNT(*) 
        FROM @tempT t2 
        WHERE t2.ColumnValue=t1.ColumnValue 
        AND t2.Id<=t1.Id)
 ,t1.ColumnValue
 ,t1.splitdata
FROM @tempT t1)X
WHERE X.Rn =3

You can read more about this in:

1)What is the purpose of system table table master..spt_values and what are the meanings of its values?

2)Why (and how) to split column using master..spt_values?

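Answer 2 (score: 2)

What you really need is something like String.Split(',')(2). Unfortunately that does not exist in SQL, but this may be helpful to you

Answer 3 (score: 1)

You can run some tests with this solution and the others, but I believe that using XML in this situation will almost always give you the best performance and require less code: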

DECLARE @InPutCSV NVARCHAR(2000)= 'josh,Reg01,False,a0-t0,22/09/2010'
DECLARE @ValueIndexToGet INT=3
DECLARE @XML XML =  CAST ('<d>' + REPLACE(@InPutCSV, ',', '</d><d>') + '</d>' AS XML);

WITH CTE(RecordNumber,Value) AS
(
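     -- ORDER BY (SELECT NULL) keeps the items in the order nodes() returns them;
     -- ordering by the value itself would number them by sort order, not by position.
     -- Document order is what SQL Server returns in practice, but it is not a documented guarantee.
     SELECT  ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RecordNumber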
             ,T.v.value('.', 'NVARCHAR(100)') AS Value
     FROM @XML.nodes('/d') AS T(v)
)
SELECT Value
FROM CTE WHERE RecordNumber=@ValueIndexToGet
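
Numbering rows with ROW_NUMBER() depends on the ORDER BY inside OVER(). An alternative that avoids row numbering altogether is to pick the element by position inside the XQuery expression. A sketch reusing the variable names above, written with separate SET statements so it also runs on SQL Server 2005:

```sql
DECLARE @InPutCSV NVARCHAR(2000);
DECLARE @ValueIndexToGet INT;
DECLARE @XML XML;

SET @InPutCSV = N'josh,Reg01,False,a0-t0,22/09/2010';
SET @ValueIndexToGet = 3;
SET @XML = CAST('<d>' + REPLACE(@InPutCSV, ',', '</d><d>') + '</d>' AS XML);

-- position() = sql:variable(...) selects the Nth <d> element in document order
SELECT @XML.value('(/d[position() = sql:variable("@ValueIndexToGet")])[1]',
                  'NVARCHAR(100)') AS Value;
```

For the sample string this returns False, and NULL when the index is past the last item. Note that the CAST to XML fails if any value contains characters such as & or <, so this approach assumes the data is free of them or has been escaped first.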

I can confirm that getting 100,000 values out of a CSV string takes 1 second.