Using STRING_SPLIT to create rows from multiple columns

Time: 2018-04-04 16:03:53

Tags: sql sql-server split

My data looks like this example (unfortunately at a much larger scale):

+----+-------+--------------------+-----------------------------------------------+
| ID | Data  | Cost               | Comments                                      |
+----+-------+--------------------+-----------------------------------------------+
| 1  | 1|2|3 | $0.00|$3.17|$42.42 | test test||previous thing has a blank comment |
+----+-------+--------------------+-----------------------------------------------+
| 2  | 1     | $420.69            | test                                          |
+----+-------+--------------------+-----------------------------------------------+
| 3  | 1|2   | $3.50|$4.20        | |test                                         |
+----+-------+--------------------+-----------------------------------------------+

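Some of the columns in my table are pipe-delimited, but they are consistent within each row: every delimited value corresponds to the value at the same index in the other columns of that row.

So I can do something like this, which gives me what I want for a single column: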

SELECT ID, s.value AS datavalue
FROM MyTable t CROSS APPLY STRING_SPLIT(t.Data, '|') s

That gives me this:

+----+-----------+
| ID | datavalue |
+----+-----------+
| 1  | 1         |
+----+-----------+
| 1  | 2         |
+----+-----------+
| 1  | 3         |
+----+-----------+
| 2  | 1         |
+----+-----------+
| 3  | 1         |
+----+-----------+
| 3  | 2         |
+----+-----------+

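But I also want to bring in the other columns (Cost and Comments in this example) so that the corresponding items all end up on the same row: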

+----+-----------+-----------+------------------------------------+
| ID | datavalue | costvalue | commentvalue                       |
+----+-----------+-----------+------------------------------------+
| 1  | 1         | $0.00     | test test                          |
+----+-----------+-----------+------------------------------------+
| 1  | 2         | $3.17     |                                    |
+----+-----------+-----------+------------------------------------+
| 1  | 3         | $42.42    | previous thing has a blank comment |
+----+-----------+-----------+------------------------------------+
| 2  | 1         | $420.69   | test                               |
+----+-----------+-----------+------------------------------------+
| 3  | 1         | $3.50     |                                    |
+----+-----------+-----------+------------------------------------+
| 3  | 2         | $4.20     | test                               |
+----+-----------+-----------+------------------------------------+

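I'm not sure what the best or simplest way to achieve this is.

2 Answers:

Answer 0: (score: 4)

Because Microsoft did not include the ordinal position in the result set, STRING_SPLIT (as shipped in SQL Server 2016 and 2017) cannot do this. You therefore need a different function. Personally, I recommend Jeff Moden's DelimitedSplit8K.

Then you can do this: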

CREATE TABLE #Sample (ID int,
                      [Data] varchar(200),
                      Cost  varchar(200),
                      Comments varchar(8000));
GO
INSERT INTO #Sample
VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment'),
       (2,'1','$420.69','test'),
       (3,'1|2','$3.50|$4.20','|test');

GO
-- Split Data first, then pick the Cost and Comments items with the same ItemNumber.
SELECT S.ID,
       DSd.Item AS DataValue,
       DSc.Item AS CostValue,
       DSct.Item AS CommentValue
FROM #Sample S
     CROSS APPLY dbo.DelimitedSplit8K(S.[Data],'|')  DSd
     CROSS APPLY (SELECT *
                  FROM dbo.DelimitedSplit8K(S.Cost,'|') SS
                  WHERE SS.ItemNumber = DSd.ItemNumber) DSc
     CROSS APPLY (SELECT *
                  FROM dbo.DelimitedSplit8K(S.Comments,'|') SS
                  WHERE SS.ItemNumber = DSd.ItemNumber) DSct;

GO
DROP TABLE #Sample;
GO
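
For readers on newer versions: in SQL Server 2022 and Azure SQL Database, STRING_SPLIT accepts an optional third argument, enable_ordinal, and then returns an extra ordinal column. Assuming such a version is available, a minimal sketch against the question's MyTable might look like this (it has not been run against the temp table above):

```sql
-- Assumes SQL Server 2022+ or Azure SQL Database, where STRING_SPLIT
-- accepts enable_ordinal = 1 and returns an additional "ordinal" column.
SELECT t.ID,
       d.value AS datavalue,
       c.value AS costvalue,
       m.value AS commentvalue
FROM MyTable t
     CROSS APPLY STRING_SPLIT(t.Data, '|', 1) d
     CROSS APPLY STRING_SPLIT(t.Cost, '|', 1) c
     CROSS APPLY STRING_SPLIT(t.Comments, '|', 1) m
WHERE c.ordinal = d.ordinal   -- keep only items at the same position
  AND m.ordinal = d.ordinal
ORDER BY t.ID, d.ordinal;
```

On older versions the third argument is not accepted, so a splitter that returns the item number is still needed there.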

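However, there is only one real answer to this question: don't store delimited values in SQL Server. Store them in a normalized form and you won't have this problem.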
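
As a rough illustration of what "normalized" could mean here, the sketch below stores one row per item instead of one delimited string per column. The table name dbo.MyTableItem, its column names, and the int and decimal types are assumptions made for this example; they are not part of the original schema.

```sql
-- Hypothetical normalized layout: one row per (ID, ItemNumber).
-- All names and data types here are illustrative assumptions.
CREATE TABLE dbo.MyTableItem (
    ID           int           NOT NULL,
    ItemNumber   int           NOT NULL,  -- position the value had in the delimited string
    DataValue    int           NOT NULL,
    CostValue    decimal(10,2) NOT NULL,
    CommentValue varchar(8000) NOT NULL DEFAULT '',
    CONSTRAINT PK_MyTableItem PRIMARY KEY (ID, ItemNumber)
);

INSERT INTO dbo.MyTableItem (ID, ItemNumber, DataValue, CostValue, CommentValue)
VALUES (1, 1, 1,   0.00, 'test test'),
       (1, 2, 2,   3.17, ''),
       (1, 3, 3,  42.42, 'previous thing has a blank comment'),
       (2, 1, 1, 420.69, 'test'),
       (3, 1, 1,   3.50, ''),
       (3, 2, 2,   4.20, 'test');

-- The desired output is then a plain SELECT with no splitting at all.
SELECT ID, DataValue, CostValue, CommentValue
FROM dbo.MyTableItem
ORDER BY ID, ItemNumber;
```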

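Answer 1: (score: 0)

Here is a solution that uses a recursive CTE instead of a user-defined function (UDF), which is useful for those who don't have permission to create functions.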

CREATE TABLE mytable(
   ID       INTEGER  NOT NULL PRIMARY KEY 
  ,Data     VARCHAR(7) NOT NULL
  ,Cost     VARCHAR(20) NOT NULL
  ,Comments VARCHAR(47) NOT NULL
);
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (2,'1','$420.69','test');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (3,'1|2','$3.50|$4.20','|test');

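This query lets you choose the delimiter through a variable. It then uses a common table expression to parse each delimited string, producing one row per part and keeping the parts that share an ordinal position together on the same row.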

declare @delimiter as varchar(1)
set @delimiter = '|'

;with cte as (
      select id
           , convert(varchar(max), null) as datavalue
           , convert(varchar(max), null) as costvalue
           , convert(varchar(max), null) as commentvalue
           , convert(varchar(max), data + @delimiter) as data
           , convert(varchar(max), cost + @delimiter) as cost
           , convert(varchar(max), comments + @delimiter) as comments
      from mytable as t
      union all
      select id
           , convert(varchar(max), left(data, charindex(@delimiter, data) - 1))
           , convert(varchar(max), left(cost, charindex(@delimiter, cost) - 1))
           , convert(varchar(max), left(comments, charindex(@delimiter, comments) - 1))
           , convert(varchar(max), stuff(data, 1, charindex(@delimiter, data), ''))
           , convert(varchar(max), stuff(cost, 1, charindex(@delimiter, cost), ''))
           , convert(varchar(max), stuff(comments, 1, charindex(@delimiter, comments), ''))
      from cte
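      -- recurse only while every column still has a delimiter; otherwise
      -- charindex() returns 0 and left(..., -1) raises an error
      where data like ('%' + @delimiter + '%')
        and cost like ('%' + @delimiter + '%')
        and comments like ('%' + @delimiter + '%')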
     )
select id, datavalue, costvalue, commentvalue
from cte
where datavalue IS NOT NULL
order by id, datavalue

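Each time the recursion adds a row, it uses left() to put the first part of the delimited string into the corresponding output column, then uses stuff() to remove that part and the delimiter after it from the source string, so the next row starts at the next part. Note that, to get the extraction started, a delimiter is appended to the end of each source string. This guarantees that the last part (or the only part, for single-value strings) is still followed by a delimiter, so the where clause does not exclude it.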
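
To make a single recursion step concrete, here is a small standalone sketch that applies the same left() and stuff() expressions to a literal string. The variable @s is introduced only for this illustration.

```sql
declare @delimiter as varchar(1)
set @delimiter = '|'

-- @s is a hypothetical stand-in for the "data" column after the
-- trailing delimiter has been appended.
declare @s as varchar(max)
set @s = '1|2|3' + @delimiter          -- '1|2|3|'

select left(@s, charindex(@delimiter, @s) - 1)      as head  -- '1'
     , stuff(@s, 1, charindex(@delimiter, @s), '')  as rest  -- '2|3|'
```

Feeding rest back in as the next @s yields '2' and '3|', then '3' and an empty string, at which point no delimiter is left and the recursion stops.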

Result:

  id   datavalue   costvalue              commentvalue             
 ---- ----------- ----------- ------------------------------------ 
   1           1   $0.00       test test                           
   1           2   $3.17                                           
   1           3   $42.42      previous thing has a blank comment  
   2           1   $420.69     test                                
   3           1   $3.50                                           
   3           2   $4.20       test                
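
One caveat: the query above orders by datavalue, which is a string, so the output order only matches the original item order when the values happen to sort that way. A possible variant, sketched here under the same mytable setup, carries an explicit counter (the column name itemnumber is my own choice) and orders by it. The maxrecursion hint is also an addition: by default a recursive CTE stops with an error after 100 levels, which would matter for strings with more than about 100 parts.

```sql
declare @delimiter as varchar(1)
set @delimiter = '|'

;with cte as (
      -- anchor: one row per source row, item 0, nothing extracted yet
      select id
           , 0 as itemnumber
           , convert(varchar(max), null) as datavalue
           , convert(varchar(max), null) as costvalue
           , convert(varchar(max), null) as commentvalue
           , convert(varchar(max), data + @delimiter) as data
           , convert(varchar(max), cost + @delimiter) as cost
           , convert(varchar(max), comments + @delimiter) as comments
      from mytable
      union all
      -- recursive step: peel off the first part of each string
      select id
           , itemnumber + 1
           , convert(varchar(max), left(data, charindex(@delimiter, data) - 1))
           , convert(varchar(max), left(cost, charindex(@delimiter, cost) - 1))
           , convert(varchar(max), left(comments, charindex(@delimiter, comments) - 1))
           , convert(varchar(max), stuff(data, 1, charindex(@delimiter, data), ''))
           , convert(varchar(max), stuff(cost, 1, charindex(@delimiter, cost), ''))
           , convert(varchar(max), stuff(comments, 1, charindex(@delimiter, comments), ''))
      from cte
      where charindex(@delimiter, data) > 0
        and charindex(@delimiter, cost) > 0
        and charindex(@delimiter, comments) > 0
     )
select id, itemnumber, datavalue, costvalue, commentvalue
from cte
where itemnumber > 0          -- drop the anchor rows
order by id, itemnumber
option (maxrecursion 0)       -- lift the default limit of 100 recursion levels
```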

Demo here at dbfiddle.uk