Split a string on multiple delimiters and keep the delimiters

Time: 2014-08-18 18:04:43

Tags: sql-server

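I need to split a string into rows at specific word boundaries. The catch is that I need to preserve the particular boundary character that triggered each split, because later I want to reassemble the rows and see the delimiters. I am updating an existing function that looks like this: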

DECLARE @Input VARCHAR(8000) = 'AC/DC, The Quick, Brown Fox'
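-- Note: inside [ ], ",-/" is parsed as a range from ',' to '/'; which characters it covers depends on the collation.
DECLARE @Delimiters varchar(100) = '%[ ,-/]%'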

;WITH 
    [Elements] AS
    (
        SELECT
            1 AS Position
            , 1 AS StartOffset
            , PATINDEX(@Delimiters, @Input) - 1 AS EndOffset
            , @Input AS Input
            , SUBSTRING(@Input, 1, ISNULL(NULLIF(PATINDEX(@Delimiters, @Input), 0) - 1, 8000)) AS Word
        UNION ALL
        SELECT 
            Position + 1 AS Position
            , EndOffset + 2 AS StartOffset
            , EndOffset + ISNULL(NULLIF(PATINDEX(@Delimiters, SUBSTRING(Input, EndOffset + 2, 8000)), 0), LEN(@Input) - EndOffset) AS EndOffset
            , Input
            , SUBSTRING(Input, EndOffset + 2, ISNULL(NULLIF(PATINDEX(@Delimiters, SUBSTRING(Input, EndOffset + 2, 8000)), 0) - 1, 8000)) AS Word
        FROM 
            [Elements]
        WHERE 
            EndOffset BETWEEN 1 AND LEN(@Input) - 1
    )
SELECT
    *
FROM
    [Elements]

This gives me:

+----------+-------------+-----------+-----------------------------+-------+
| Position | StartOffset | EndOffset |            Input            | Word  |
+----------+-------------+-----------+-----------------------------+-------+
|        1 |           1 |         2 | AC/DC, The Quick, Brown Fox | AC    |
|        2 |           4 |         5 | AC/DC, The Quick, Brown Fox | DC    |
|        3 |           7 |         6 | AC/DC, The Quick, Brown Fox |       |
|        4 |           8 |        10 | AC/DC, The Quick, Brown Fox | The   |
|        5 |          12 |        16 | AC/DC, The Quick, Brown Fox | Quick |
|        6 |          18 |        17 | AC/DC, The Quick, Brown Fox |       |
|        7 |          19 |        23 | AC/DC, The Quick, Brown Fox | Brown |
|        8 |          25 |        27 | AC/DC, The Quick, Brown Fox | Fox   |
+----------+-------------+-----------+-----------------------------+-------+

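This breaks the string up nicely, but the "/" and "," rows are missing from the result set. I do have a numbers table I can work against, and I am very flexible about how this gets done.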

I could brute-force it with a loop, but that seems too crude.

1 Answer:

Answer 0 (score: 0)

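Well, you were close. You just need to branch on whether the "split" lands on a delimiter. The code gets ugly, but you don't have to brute-force it. If I really worked at it I could probably make it "prettier"; I was just quickly building on what you had already written.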
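The rule behind that branch can be looked at in isolation. This is only a sketch: @Rest and @Hit are illustrative names that stand for the unprocessed tail of the input and the position of its first delimiter, and they do not appear in the query itself.

```sql
-- Illustrative only: how long should the next token be?
DECLARE @Delimiters VARCHAR(100) = '%[ ,-/]%';
DECLARE @Rest VARCHAR(8000) = ', The Quick';   -- hypothetical remaining text

-- Position of the first delimiter in the tail, 0 if there is none.
DECLARE @Hit INT = PATINDEX(@Delimiters, @Rest);

SELECT CASE
           WHEN @Hit = 1 THEN 1                  -- tail starts with a delimiter: emit it as its own row
           WHEN @Hit = 0 THEN DATALENGTH(@Rest)  -- no delimiter left: take the rest of the string
           ELSE @Hit - 1                         -- take the word up to, but not including, the delimiter
       END AS TokenLength;
```

The full query below applies this same rule in both the anchor and the recursive member, just written inline with ISNULL/NULLIF.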

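DECLARE @Input VARCHAR(8000) = ',AC/DC,,The Quick, Brown Fox-Hound'
-- Inside [ ], ",-/" is parsed as a range from ',' to '/'; which characters that
-- covers depends on the collation. Put the hyphen first ('%[- ,/]%') to match it literally.
DECLARE @Delimiters VARCHAR(100) = '%[ ,-/]%'

;WITH
    [Elements] AS
    (
        -- Anchor: the first token is either a leading delimiter (one character)
        -- or the word that runs up to the first delimiter.
        SELECT
            1 AS Position
            , 1 AS StartOffset
            , PATINDEX(@Delimiters, @Input) - 1
                + CASE WHEN PATINDEX(@Delimiters, LEFT(@Input, 1)) = 1 THEN 1 ELSE 0 END AS EndOffset
            , @Input AS Input
            , SUBSTRING(@Input, 1,
                ISNULL(NULLIF(PATINDEX(@Delimiters, @Input), 0) - 1
                    + CASE WHEN PATINDEX(@Delimiters, LEFT(@Input, 1)) = 1 THEN 1 ELSE 0 END, 8000)) AS Word
        UNION ALL
        -- Recursive step: continue right after the previous token. If the rest of the
        -- string starts with a delimiter, emit that single character; otherwise emit
        -- the word up to the next delimiter (or to the end of the input if none is left).
        SELECT
            Position + 1 AS Position
            , EndOffset + 1 AS StartOffset
            , EndOffset + ISNULL(
                NULLIF(PATINDEX(@Delimiters, SUBSTRING(Input, EndOffset + 1, 8000)), 0)
                    - CASE WHEN PATINDEX(@Delimiters, SUBSTRING(Input, EndOffset + 1, 8000)) = 1 THEN 0 ELSE 1 END
                , DATALENGTH(Input) - EndOffset) AS EndOffset
            , Input
            , SUBSTRING(Input, EndOffset + 1, ISNULL(
                NULLIF(PATINDEX(@Delimiters, SUBSTRING(Input, EndOffset + 1, 8000)), 0)
                    - CASE WHEN PATINDEX(@Delimiters, SUBSTRING(Input, EndOffset + 1, 8000)) = 1 THEN 0 ELSE 1 END
                , DATALENGTH(Input) - EndOffset)) AS Word
        FROM
            [Elements]
        WHERE
            EndOffset BETWEEN 1 AND DATALENGTH(@Input) - 1
    )
SELECT
    *
FROM
    [Elements]
-- Lift the default limit of 100 recursion levels so inputs with many tokens do not fail.
-- If this lives inside an inline function, the hint likely has to go on the calling query instead.
OPTION (MAXRECURSION 0)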
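Since the question mentions having a numbers table, here is a rough sketch of a set-based variant that also checks the reassembly. It is not the method above, just one possible alternative under a few assumptions: SQL Server 2012 or later (for LEAD), a hypothetical @DelimiterChars variable that lists the delimiter characters plainly and is tested with CHARINDEX instead of a PATINDEX pattern, a Numbers CTE built from sys.all_objects as a stand-in for your real numbers table, and a @Tokens table variable that exists only for the demo.

```sql
-- Sketch: numbers-table split that keeps each delimiter as its own row.
DECLARE @Input VARCHAR(8000) = 'AC/DC, The Quick, Brown Fox';
DECLARE @DelimiterChars VARCHAR(100) = ' ,-/';   -- hypothetical: plain list of delimiter characters
DECLARE @Tokens TABLE (Position INT PRIMARY KEY, StartOffset INT, Word VARCHAR(8000));

WITH Numbers AS (   -- stand-in for the existing numbers table: 1 .. length of @Input
    SELECT TOP (DATALENGTH(@Input))
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b
    ORDER BY n
),
Starts AS (         -- a token starts at position 1, at every delimiter, and right after every delimiter
    SELECT n AS StartOffset
    FROM Numbers
    WHERE n = 1
       OR CHARINDEX(SUBSTRING(@Input, n, 1), @DelimiterChars) > 0
       OR CHARINDEX(SUBSTRING(@Input, n - 1, 1), @DelimiterChars) > 0
),
Bounds AS (         -- each token ends just before the next token starts
    SELECT StartOffset,
           LEAD(StartOffset, 1, DATALENGTH(@Input) + 1) OVER (ORDER BY StartOffset) AS NextStart
    FROM Starts
)
INSERT INTO @Tokens (Position, StartOffset, Word)
SELECT ROW_NUMBER() OVER (ORDER BY StartOffset),
       StartOffset,
       SUBSTRING(@Input, StartOffset, NextStart - StartOffset)
FROM Bounds;

-- One row per word and one row per delimiter, in input order.
SELECT Position, StartOffset, Word FROM @Tokens ORDER BY Position;

-- Reassemble the rows in order; the result should equal @Input.
DECLARE @Rebuilt VARCHAR(8000);
SELECT @Rebuilt =
    (SELECT Word AS [text()]
     FROM @Tokens
     ORDER BY Position
     FOR XML PATH(''), TYPE).value('.', 'VARCHAR(8000)');

SELECT @Rebuilt AS Rebuilt,
       CASE WHEN @Rebuilt = @Input THEN 1 ELSE 0 END AS MatchesInput;
```

Because every character of the input lands in exactly one row, concatenating Word by Position should give back the original string, which is the property the question asks for. Listing the delimiters as plain characters also sidesteps the range behaviour of a hyphen inside a [ ] pattern.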