Finding matching values within a space-delimited field

Time: 2019-02-11 12:38:02

Tags: sql sql-server delimited-text

In SQL Server, I have a field that contains space-delimited data.

For example:

recid| Delimited data field
1| 1 2 3 4 5
2| 1 2 3 3 5
3| 1 1 1 1 1

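I need to go through every record in the database, examine the delimited data field, and compare the third and fourth parts of the data with each other. If they match, the query should return the recid and the whole delimited field.

So, with my example records, records 2 and 3 have matching third and fourth parts, and the query would return: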

2|1 2 3 3 5
3|1 1 1 1 1

because 3 matches 3 in record 2, and 1 matches 1 in record 3.

Thanks.

6 answers:

Answer 0: (score: 2)

If every value is always a single digit and the format never changes, you can try something like this.

select * from @table
where SUBSTRING([data], 5, 1) = SUBSTRING([data], 7, 1)

If not (the values can be longer than one digit), you can try something like this.

;WITH cte 
     AS (SELECT F1.recid, 
                F1.[data], 
                O.splitdata, 
                Row_number() 
                  OVER( 
                    partition BY recid 
                    ORDER BY (SELECT 1)) rn 
         FROM   (SELECT *, 
                        Cast('<X>' + Replace(F.data, ' ', '</X><X>') + '</X>' AS 
                             XML) 
                        AS 
                                xmlfilter 
                 FROM   @table F)F1 
                CROSS apply (SELECT fdata.d.value('.', 'varchar(50)') AS 
                                    splitdata 
                             FROM   f1.xmlfilter.nodes('X') AS fdata(d)) O) 
SELECT c1.recid, 
       c1.data 
FROM   cte c1 
       INNER JOIN cte c2 
               ON c1.recid = c2.recid 
                  AND c1.rn = 3 
                  AND c2.rn = 4 
                  AND c1.splitdata = c2.splitdata 
GROUP  BY c1.recid, 
          c1.data 

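One caveat with the query above: ROW_NUMBER() ... ORDER BY (SELECT 1) does not by itself promise that the pieces are numbered in their original order, even though that is what is usually observed with XML nodes. If you happen to be on SQL Server 2022 or Azure SQL Database, STRING_SPLIT accepts a third argument (enable_ordinal) that returns each piece's position explicitly. A minimal sketch under that assumption; the temp table #Demo is hypothetical and only mirrors the question's sample data:

```sql
-- Hypothetical sample table mirroring the question's data.
CREATE TABLE #Demo (recid INT, data VARCHAR(100));
INSERT INTO #Demo (recid, data)
VALUES (1, '1 2 3 4 5'), (2, '1 2 3 3 5'), (3, '1 1 1 1 1');

-- Assumes SQL Server 2022+ or Azure SQL Database (enable_ordinal argument).
SELECT d.recid, d.data
FROM #Demo AS d
CROSS APPLY (
    -- Pivot the 3rd and 4th pieces of the current row into two columns.
    SELECT MAX(CASE WHEN s.ordinal = 3 THEN s.value END) AS third,
           MAX(CASE WHEN s.ordinal = 4 THEN s.value END) AS fourth
    FROM STRING_SPLIT(d.data, ' ', 1) AS s
) AS p
WHERE p.third = p.fourth;  -- rows with fewer than four pieces give NULL and drop out
```

For the three sample rows this should return recid 2 and 3. Because the values are compared as strings, it also works when they are longer than one digit.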

Answer 1: (score: 0)

You need to split the data, assign row numbers to the pieces, and then compare them.

Schema:

SELECT *  INTO #TAB FROM (
SELECT 1, '1 2 3 4 5' UNION ALL
SELECT 2, '1 2 3 3 5' UNION ALL
SELECT 3, '1 1 1 1 1'  
)A (recid , Delimited_data_field)

Solution:

;WITH CTE
AS (
    SELECT recid
        ,Delimited_data_field
        ,ROW_NUMBER() OVER (PARTITION BY recid ORDER BY (SELECT 1)) RNO
        ,splt.X.value('.', 'INT') VAL
    FROM (
        SELECT recid
            ,Delimited_data_field
            ,CAST('<M>' + REPLACE(Delimited_data_field, ' ', '</M><M>') + '</M>' AS XML) DATA
        FROM #TAB
        ) A
    CROSS APPLY A.DATA.nodes('/M') splt(x)
    )
SELECT C.recid
    ,C2.Delimited_data_field
FROM CTE C
INNER JOIN CTE C2 ON C.recid = C2.recid AND C.RNO = 3 AND C2.RNO = 4
AND C.VAL = C2.VAL 

Result:

recid   Delimited_data_field
2       1 2 3 3 5
3       1 1 1 1 1

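Answer 2: (score: 0)

Your problem has two parts: find the nth split value, then compare. Your first approach should be to break the problem down until you reach built-in functions that can do the job. Here is one way, where the inner query returns the split values and the outer query compares them: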

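SELECT recid, Delimited
FROM (
        SELECT recid,
               Delimited,
               -- Third value: the character after the second space (assumes single-character values)
               SUBSTRING(Delimited,
                         CHARINDEX(' ', Delimited, CHARINDEX(' ', Delimited, 1) + 2) + 1, 1) AS third,
               -- Fourth value: the character after the third space (same assumption)
               SUBSTRING(Delimited,
                         CHARINDEX(' ', Delimited, CHARINDEX(' ', Delimited, 1) + 3) + 1, 1) AS fourth
        FROM YourTable) tr
WHERE third = fourth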

As you can see, plain SUBSTRING and CHARINDEX are enough to do the job.
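The fixed offsets (+2, +3) and the length of 1 only line up when every value is exactly one character long. A possible generalization, still using only CHARINDEX and SUBSTRING, is to locate each space in turn with chained CROSS APPLY steps. This is a sketch rather than a drop-in replacement: the temp table #Demo2 is hypothetical, and it assumes the values are separated by exactly one space.

```sql
-- Hypothetical sample table; the second row has multi-character values.
CREATE TABLE #Demo2 (recid INT, Delimited VARCHAR(100));
INSERT INTO #Demo2 (recid, Delimited)
VALUES (1, '1 2 3 4 5'), (2, '10 20 30 30 50'), (3, '1 1 1 1 1');

SELECT t.recid, t.Delimited
FROM #Demo2 AS t
-- Pad with four spaces so four separators always exist and no CHARINDEX call returns 0.
CROSS APPLY (SELECT t.Delimited + '    ' AS s) AS pad
CROSS APPLY (SELECT CHARINDEX(' ', pad.s) AS p1) AS a            -- space ending part 1
CROSS APPLY (SELECT CHARINDEX(' ', pad.s, a.p1 + 1) AS p2) AS b  -- space ending part 2
CROSS APPLY (SELECT CHARINDEX(' ', pad.s, b.p2 + 1) AS p3) AS c  -- space ending part 3
CROSS APPLY (SELECT CHARINDEX(' ', pad.s, c.p3 + 1) AS p4) AS d  -- space ending part 4
CROSS APPLY (SELECT SUBSTRING(pad.s, b.p2 + 1, c.p3 - b.p2 - 1) AS third,
                    SUBSTRING(pad.s, c.p3 + 1, d.p4 - c.p3 - 1) AS fourth) AS v
WHERE v.third = v.fourth
  AND v.fourth <> '';  -- skip rows that have fewer than four parts
```

For these sample rows the query should return recid 2 and 3. Each CROSS APPLY step only names an intermediate value, so the nested expression never has to be repeated.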

Answer 3: (score: 0)

Here is another solution.

I made a few adjustments to the split function from this link (T-SQL: Opposite to string concatenation - how to split string into multiple records) to make it useful in your case.

Here is the function.

CREATE FUNCTION dbo.SplitAndGetNumberAt (@sep char(1), @s varchar(512), @pos int)
RETURNS INT
BEGIN
declare @val as varchar(10);

WITH Pieces(pn, start, stop) AS (
    SELECT 1, 1, CHARINDEX(@sep, @s)
    UNION ALL
    SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
    FROM Pieces
    WHERE stop > 0
)
SELECT @val = SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END)
FROM Pieces where pn = @pos;

RETURN @val
END

Now you can use this function to get the third and fourth numbers and compare them easily.

select recid, deldata
from so1
where dbo.SplitAndGetNumberAt (' ', deldata, 3) = dbo.SplitAndGetNumberAt (' ', deldata, 4)
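To see what the function returns on its own, here is a quick check with literal strings (this assumes the function above has already been created in the current database):

```sql
-- Assumes dbo.SplitAndGetNumberAt from this answer already exists.
SELECT dbo.SplitAndGetNumberAt(' ', '1 2 3 3 5', 3) AS third_part,   -- 3
       dbo.SplitAndGetNumberAt(' ', '1 2 3 3 5', 4) AS fourth_part,  -- 3
       dbo.SplitAndGetNumberAt(' ', '1 2', 3)       AS missing_part; -- NULL: there is no third part
```

Since NULL = NULL does not evaluate to true in a WHERE clause, rows with fewer than four parts are simply not returned by the query above. Note also that the function is declared RETURNS INT, so a non-numeric part would cause a conversion error.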

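Hope this helps.

Answer 4: (score: 0)

If you have SQL Server 2016 or later, one possible approach is to split the input data with OPENJSON(). The important point is that when OPENJSON parses a JSON array, the index of each element in the JSON text is returned as its key (0-based).

Input: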

CREATE TABLE #Table (
   RecId int,
   Data varchar(max)
)
INSERT INTO #Table
   (RecId, Data)
VALUES 
   (1, '1 2 3 4 5'),
   (2, '1 2 3 3 5'),
   (3, '1 1 1 1 1')

Statement:

SELECT 
   t.RecId,
   t.Data
FROM #Table t
CROSS APPLY (SELECT [value] FROM OPENJSON('["' +  REPLACE(t.Data,' ','","') + '"]') WHERE [key] = 2) j3
CROSS APPLY (SELECT [value] FROM OPENJSON('["' +  REPLACE(t.Data,' ','","') + '"]') WHERE [key] = 3) j4
WHERE j3.[value] = j4.[value]

Output:

RecId   Data
2       1 2 3 3 5
3       1 1 1 1 1
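The statement above builds and parses the JSON array twice per row, and it would fail if the data ever contained a double quote or a backslash. A possible single-pass variant is sketched here, assuming SQL Server 2016 or later (for OPENJSON and STRING_ESCAPE); the temp table #JsonDemo is hypothetical and just repeats the sample input:

```sql
-- Hypothetical sample table repeating the input above.
CREATE TABLE #JsonDemo (RecId INT, Data VARCHAR(MAX));
INSERT INTO #JsonDemo (RecId, Data)
VALUES (1, '1 2 3 4 5'), (2, '1 2 3 3 5'), (3, '1 1 1 1 1');

SELECT t.RecId, t.Data
FROM #JsonDemo AS t
CROSS APPLY (
    -- Parse the array once and pick out elements 2 and 3 (0-based keys).
    SELECT MAX(CASE WHEN [key] = 2 THEN [value] END) AS third,
           MAX(CASE WHEN [key] = 3 THEN [value] END) AS fourth
    -- STRING_ESCAPE keeps quotes and backslashes in the data from breaking the JSON.
    FROM OPENJSON('["' + REPLACE(STRING_ESCAPE(t.Data, 'json'), ' ', '","') + '"]')
) AS j
WHERE j.third = j.fourth;
```

For the sample rows this should give the same two records (RecId 2 and 3).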

Answer 5: (score: 0)

Just for fun, a bit of crazy coding:

DECLARE @Table Table (
    recid               INT,
    DelimitedDataField  VARCHAR(32)
)

INSERT @Table (recid, DelimitedDataField)
VALUES
    (1, '1 2 3 4 5'),
    (2, '1 2 3 3 5'),
    (3, '1 1 1 1 1')

SELECT *
FROM @Table
WHERE
SUBSTRING (
    STUFF(
        STUFF(
            DelimitedDataField + ' - - -',
            1,
            CHARINDEX(' ', DelimitedDataField + ' - - -'),
            ''
        ),
        1,
        CHARINDEX(' ', STUFF(
                        DelimitedDataField + ' - - -',
                        1,
                        CHARINDEX(' ', DelimitedDataField + ' - - -'), '')
                 ),
        ''),
    1,
    CHARINDEX(' ', STUFF(
        STUFF(
            DelimitedDataField + ' - - -',
            1,
            CHARINDEX(' ', DelimitedDataField + ' - - -'),
            ''
        ),
        1,
        CHARINDEX(' ', STUFF(
                        DelimitedDataField + ' - - -',
                        1,
                        CHARINDEX(' ', DelimitedDataField + ' - - -'), '')
                 ),
        '')
        )
) = 
SUBSTRING (
        STUFF(
            STUFF(
                STUFF(
                    DelimitedDataField + ' - - -',
                    1,
                    CHARINDEX(' ', DelimitedDataField + ' - - -'),
                    ''
                ),
                1,
                CHARINDEX(' ', STUFF(
                                DelimitedDataField + ' - - -',
                                1,
                                CHARINDEX(' ', DelimitedDataField + ' - - -'), '')
                         ),
                ''),
            1,
            CHARINDEX(' ', STUFF(
                STUFF(
                    DelimitedDataField + ' - - -',
                    1,
                    CHARINDEX(' ', DelimitedDataField + ' - - -'),
                    ''
                ),
                1,
                CHARINDEX(' ', STUFF(
                                DelimitedDataField + ' - - -',
                                1,
                                CHARINDEX(' ', DelimitedDataField + ' - - -'), '')
                         ),
                '')
            ),
            ''
        ),
        1,
        CHARINDEX(' ',      STUFF(
            STUFF(
                STUFF(
                    DelimitedDataField + ' - - -',
                    1,
                    CHARINDEX(' ', DelimitedDataField + ' - - -'),
                    ''
                ),
                1,
                CHARINDEX(' ', STUFF(
                                DelimitedDataField + ' - - -',
                                1,
                                CHARINDEX(' ', DelimitedDataField + ' - - -'), '')
                         ),
                ''),
            1,
            CHARINDEX(' ', STUFF(
                STUFF(
                    DelimitedDataField + ' - - -',
                    1,
                    CHARINDEX(' ', DelimitedDataField + ' - - -'),
                    ''
                ),
                1,
                CHARINDEX(' ', STUFF(
                                DelimitedDataField + ' - - -',
                                1,
                                CHARINDEX(' ', DelimitedDataField + ' - - -'), '')
                         ),
                '')
            ),
            ''
        ))
)

AND SUBSTRING (
    STUFF(
        STUFF(
            DelimitedDataField + ' - - -',
            1,
            CHARINDEX(' ', DelimitedDataField + ' - - -'),
            ''
        ),
        1,
        CHARINDEX(' ', STUFF(
                        DelimitedDataField + ' - - -',
                        1,
                        CHARINDEX(' ', DelimitedDataField + ' - - -'), '')
                 ),
        ''),
    1,
    CHARINDEX(' ', STUFF(
        STUFF(
            DelimitedDataField + ' - - -',
            1,
            CHARINDEX(' ', DelimitedDataField + ' - - -'),
            ''
        ),
        1,
        CHARINDEX(' ', STUFF(
                        DelimitedDataField + ' - - -',
                        1,
                        CHARINDEX(' ', DelimitedDataField + ' - - -'), '')
                 ),
        '')
        )
) <>'-'