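Troubleshooting splitting a value into multiple rows using T-SQL

Date: 2011-11-22 10:23:37

Tags: tsql

I have a SQL Server 2008 R2 database with a table whose column contains multiple values separated by CR+LF (char 13 followed by char 10).

I'm building a script to import the data into a properly normalized schema.

My source table contains something like this: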

ID   |    Value
________________
1    |    line 1
          line 2
________________
2    |    line 3
________________
3    |    line 4
          line 5
          line 6
________________

And so on.

[edit] FYI, ID is an integer and Value is nvarchar(3072). [/edit]

What I want is a query against this table that outputs something like this:

ID   |    Value
________________
1    |    line 1
________________
1    |    line 2
________________
2    |    line 3
________________
3    |    line 4
________________
3    |    line 5
________________
3    |    line 6
________________

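I've read many answers on SO and around the web, and found that using master..spt_values should be the solution. In particular, I tried to reproduce the solution from the question "Split one column into multiple rows", but without success (I suspect the two-character delimiter is causing the problem).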
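For context, the technique relies on the rows of master..spt_values with type 'P', which hold consecutive integers and can stand in for a numbers table. The sketch below only inspects that range; as far as I know it runs from 0 to 2047 on SQL Server 2008 R2, which is an assumption worth checking on the actual server, because an nvarchar(3072) value longer than that range would not be fully covered.

```sql
-- Inspect the integers available from the 'P' rows of master..spt_values.
-- Assumption: on SQL Server 2008 R2 these run from 0 to 2047.
SELECT MIN(spt.number) AS FirstNumber,
       MAX(spt.number) AS LastNumber,
       COUNT(*)        AS HowMany
FROM master..spt_values AS spt
WHERE spt.type = 'P';
```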

So far, I've written this query:

SELECT
    T.ID,
    T.Value, 
    RIGHT(LEFT(T.Value,spt.Number-1),
    CHARINDEX(char(13)+char(10),REVERSE(LEFT(char(13)+char(10)+T.Value,spt.Number-1)))) as Extracted
FROM 
    master..spt_values spt,
    ContactsNew T
WHERE
    Type = 'P' AND 
    spt.Number BETWEEN 1 AND LEN(T.Value)+1
    AND
        (SUBSTRING(T.Value,spt.Number,2) = char(13)+char(10) OR SUBSTRING(T.Value,spt.Number,2) = '')

Unfortunately, this query returns:

ID   |    Value    |   Extracted
________________________________   
1    |    line 1   |   <blank>
          line 2   |   
________________________________   
1    |    line 1   |   line 2
          line 2   |   
________________________________   
2    |    line 3   |   <blank>
________________________________   
3    |    line 4   |   <blank>
          line 5   |
          line 6   |
________________________________   
3    |    line 4   |   line 5
          line 5   |   line 6
          line 6   |
________________________________   
3    |    line 4   |   line 6
          line 5   |
          line 6   |
________________________________  

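<blank> is an empty string, not NULL.

I'd appreciate help adjusting my query.

[Edit2] My source table contains fewer than 200 records and performance is not a requirement, so I'm aiming for a simple solution rather than an efficient one. [/Edit2]

[Edit3] The source database is read-only. I can't add stored procedures, functions, or CLR types. I have to do this in a single query. [/Edit3]

[Edit4] Something strange: it seems that spaces are also treated as delimiters.

If I run the following query: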

SELECT
    T.ID,
    replace(T.Value, '#', ' '), 
    replace(RIGHT(
        LEFT(T.Value,spt.Number-1),
        CHARINDEX( char(13) + char(10),REVERSE(LEFT(char(10) + char(13)+T.Value,spt.Number-0)))
        ), '#', ' ')
FROM 
    master..spt_values spt,
    (   
        select contactID, 
        replace(Value,' ', '#') Value
        from ContactsNew where Value is not null
    ) T
WHERE
    Type = 'P' AND 
    spt.Number BETWEEN 1 AND LEN(T.Value)+1
    AND
        (SUBSTRING(T.Value,spt.Number,2) =  char(13) + char(10) OR SUBSTRING(T.Value,spt.Number,1) = '')

I get the correct number of rows (although the values are still wrong). Running this query, on the other hand:

SELECT
    T.ID,
    T.Value, 
    RIGHT(
        LEFT(T.Value,spt.Number-1),
        CHARINDEX( char(13) + char(10),REVERSE(LEFT(char(10) + char(13)+T.Value,spt.Number-0)))
        )
FROM 
    master..spt_values spt,
    (   
        select contactID, 
        Value
        from ContactsNew where Value is not null
    ) T
WHERE
    Type = 'P' AND 
    spt.Number BETWEEN 1 AND LEN(T.Value)+1
    AND
        (SUBSTRING(T.Value,spt.Number,2) =  char(13) + char(10) OR SUBSTRING(T.Value,spt.Number,1) = '')

the values get split on spaces as well.
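A likely explanation for the space behavior, offered as an assumption rather than something confirmed in this thread: SQL Server ignores trailing spaces when comparing strings with =, so the end-of-string test SUBSTRING(...) = '' is also true when the substring is a single space. The small sketch below shows the comparison rule in isolation.

```sql
-- Trailing spaces are ignored by = comparisons, so a lone space "equals" ''.
SELECT
    CASE WHEN N' ' = N'' THEN 'equal' ELSE 'different' END AS SpaceVsEmpty, -- 'equal'
    LEN(N'ab ')      AS LenValue,   -- 2: LEN also skips trailing spaces
    DATALENGTH(N' ') AS ByteCount;  -- 2: the space is still stored (one nvarchar character)
```

If this is the cause, testing for the end of the string by position (for example spt.Number = LEN(T.Value) + 1) instead of comparing against '' should avoid matching spaces.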

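2 answers:

Answer 0 (score: 1)

Edit #1: I removed the original answer text. Try the following query. I modified your logic slightly. If you have any questions about it, feel free to comment. If you need a different split delimiter, just introduce another nested query that replaces that delimiter with CHAR(13) + CHAR(10).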

SELECT 
* 
FROM 
(
    SELECT
        T.ID,
        T.Value,
        CASE
            WHEN CHARINDEX(CHAR(13) + CHAR(10), SUBSTRING(T.Value, spt.number, LEN(T.Value) - spt.Number + 1)) > 0 THEN
                LEFT(
                    SUBSTRING(T.Value, spt.number, LEN(T.Value) - spt.Number + 1), 
                    CHARINDEX(CHAR(13) + CHAR(10), SUBSTRING(T.Value, spt.number, LEN(T.Value) - spt.Number + 1)) - 1)
            /* added by Steve B. see comments for the reasons */
            WHEN LEN(T.Value) = spt.Number THEN RIGHT(T.Value, spt.Number - 1)
            /* end of edit */
            ELSE
                SUBSTRING(T.Value, spt.number, LEN(T.Value) - spt.Number + 1)
        END EXTRACTED
    FROM 
        master..spt_values spt,
        ContactsNew T
    WHERE
        Type = 'P' AND 
        spt.Number BETWEEN 1 AND LEN(T.Value)+1
) X
WHERE 
    EXTRACTED <> '' AND
    (
        LEFT(X.VALUE, LEN(EXTRACTED)) = EXTRACTED OR 
        X.Value LIKE '%' + CHAR(13) + CHAR(10) + EXTRACTED + CHAR(13) + CHAR(10) + '%' OR
        X.Value LIKE '%' + CHAR(13) + CHAR(10) + EXTRACTED
    )
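The remark about other delimiters can be sketched as follows, assuming a hypothetical source whose values are separated by ';' rather than CR+LF. The ';' character and the alias Src are illustrative assumptions, not part of the original answer; the idea is only to rewrite the delimiter before the splitting logic sees the value.

```sql
-- Hypothetical: values are separated by ';' instead of CR+LF.
-- Rewrite the delimiter so the splitting query can stay unchanged.
SELECT
    Src.ID,
    REPLACE(Src.Value, N';', CHAR(13) + CHAR(10)) AS Value
FROM ContactsNew AS Src
WHERE Src.Value IS NOT NULL;
```

Wrapping this SELECT in parentheses with the alias T, in place of ContactsNew T in the query above, would feed the normalized values to the split.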

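Answer 1 (score: 0)

A sample query showing how to do this kind of thing against some test data similar to what you describe.

If you can't declare variables in your final statement, you can find/replace them with their values, but the variables make things simpler to read.

This works by replacing CR+LF with a single character before doing the split. If '|' occurs in your data, choose a different single character that does not occur in it as the temporary delimiter.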

declare @crlf nvarchar(2) = char(13) + char(10) -- CR (13) followed by LF (10), the order used in the question's data
declare @cDelim nvarchar(1) = N'|'

-- test data
declare @t table
(id int
,value nvarchar(3072))

insert @t
select 1, 'line1' + @crlf + 'line2'
union all select 2, 'line3'
union all select 3, 'line4' + @crlf + 'line5' + @crlf + 'line6'
-- /test data



;WITH charCTE
AS
( 
        --split the string into a dataset
        SELECT  D.id, D.value, SUBSTRING(D.s,n,CHARINDEX(@cDelim, D.s + @cDelim,n) -n) AS ELEMENT
        FROM (SELECT id, value, REPLACE(value,@crlf,@cDelim) as s from @t)    AS D
        JOIN (SELECT TOP 3072 ROW_NUMBER() OVER (ORDER BY a.type, a.number, a.name) AS n
              FROM master.dbo.spt_values a 
              CROSS 
              JOIN master.dbo.spt_values b 
              ) AS numsCte
        ON n <= LEN(s)
        AND SUBSTRING(@cDelim + s,n,1) = @cDelim 
)
SELECT id, ELEMENT
FROM charCTE
order by id, element
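Since the question requires a single statement against a read-only database, here is a variable-free sketch of the same idea. It assumes the ContactsNew(ID, Value) table from the question and that '|' never occurs in Value; both are assumptions to verify first. The alias Nums and the way the numbers are built are my own choices: two sets of 'P' rows are combined to reach 4096 positions, on the assumption that the 'P' rows run from 0 to 2047.

```sql
-- Sketch only: assumes ContactsNew(ID, Value) from the question
-- and that N'|' never occurs in Value (pick another character if it does).
SELECT
    D.ID,
    SUBSTRING(D.s, Nums.n, CHARINDEX(N'|', D.s + N'|', Nums.n) - Nums.n) AS Element
FROM (
    -- collapse the two-character CR+LF delimiter into one character
    SELECT ID, REPLACE(Value, CHAR(13) + CHAR(10), N'|') AS s
    FROM ContactsNew
    WHERE Value IS NOT NULL
) AS D
JOIN (
    -- numbers 1..4096, built from the 'P' rows (assumed to be 0..2047)
    SELECT hi.number * 2048 + lo.number + 1 AS n
    FROM master.dbo.spt_values AS lo
    CROSS JOIN master.dbo.spt_values AS hi
    WHERE lo.type = 'P'
      AND hi.type = 'P'
      AND hi.number IN (0, 1)
) AS Nums
    ON Nums.n <= LEN(D.s)
   AND SUBSTRING(N'|' + D.s, Nums.n, 1) = N'|'   -- n is the start of an element
ORDER BY D.ID, Nums.n;
```

Ordering by Nums.n rather than by the element text keeps the lines of each ID in their original order.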