My data looks like this example (unfortunately at a much larger scale):
+----+-------+--------------------+-----------------------------------------------+
| ID | Data | Cost | Comments |
+----+-------+--------------------+-----------------------------------------------+
| 1 | 1|2|3 | $0.00|$3.17|$42.42 | test test||previous thing has a blank comment |
+----+-------+--------------------+-----------------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-------+--------------------+-----------------------------------------------+
| 3 | 1|2 | $3.50|$4.20 | |test |
+----+-------+--------------------+-----------------------------------------------+
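Some columns in my table are pipe-delimited, and they are consistent within each row: every delimited value corresponds to the value at the same index in the other columns of that row.
So I can do something like this, which gives me what I want for one column: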
SELECT ID, s.value AS datavalue
FROM MyTable t CROSS APPLY STRING_SPLIT(t.Data, '|') s
That gives me this:
+----+-----------+
| ID | datavalue |
+----+-----------+
| 1 | 1 |
+----+-----------+
| 1 | 2 |
+----+-----------+
| 1 | 3 |
+----+-----------+
| 2 | 1 |
+----+-----------+
| 3 | 1 |
+----+-----------+
| 3 | 2 |
+----+-----------+
But I also want the other columns (Cost and Comments in this example), so that the corresponding items all end up on the same row:
+----+-----------+-----------+------------------------------------+
| ID | datavalue | costvalue | commentvalue |
+----+-----------+-----------+------------------------------------+
| 1 | 1 | $0.00 | test test |
+----+-----------+-----------+------------------------------------+
| 1 | 2 | $3.17 | |
+----+-----------+-----------+------------------------------------+
| 1 | 3 | $42.42 | previous thing has a blank comment |
+----+-----------+-----------+------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-----------+-----------+------------------------------------+
| 3 | 1 | $3.50 | |
+----+-----------+-----------+------------------------------------+
| 3 | 2 | $4.20 | test |
+----+-----------+-----------+------------------------------------+
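I'm not sure what the best or simplest way to achieve this is.
Answer 0 (score: 4)
Because Microsoft chose not to return the ordinal position as part of the result set, STRING_SPLIT (in the two-argument form used above) cannot do this. You therefore need a different function. Personally, I recommend Jeff Moden's DelimitedSplit8K.
Then you can do this: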
CREATE TABLE #Sample (ID int,
                      [Data] varchar(200),
                      Cost varchar(200),
                      Comments varchar(8000));
GO
INSERT INTO #Sample
VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment'),
       (2,'1','$420.69','test'),
       (3,'1|2','$3.50|$4.20','|test');
GO
-- Split Data first, then pick the item with the same ItemNumber from Cost and Comments
SELECT S.ID,
       DSd.Item AS DataValue,
       DSc.Item AS CostValue,
       DSct.Item AS CommentValue
FROM #Sample S
     CROSS APPLY dbo.DelimitedSplit8K(S.[Data],'|') DSd
     CROSS APPLY (SELECT *
                  FROM dbo.DelimitedSplit8K(S.Cost,'|') SS
                  WHERE SS.ItemNumber = DSd.ItemNumber) DSc
     CROSS APPLY (SELECT *
                  FROM dbo.DelimitedSplit8K(S.Comments,'|') SS
                  WHERE SS.ItemNumber = DSd.ItemNumber) DSct;
GO
DROP TABLE #Sample;
GO
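For readers on a newer platform: SQL Server 2022 (16.x) and Azure SQL Database added an optional third argument to STRING_SPLIT, enable_ordinal, which returns an ordinal column alongside value. The sketch below assumes such a version and reuses the MyTable name from the question; it is only an illustration of the same join-on-position idea, not a replacement for the approach above on older versions.

```sql
-- Assumption: SQL Server 2022+ or Azure SQL Database, where
-- STRING_SPLIT(string, separator, 1) also returns an [ordinal] column.
SELECT t.ID,
       d.value  AS datavalue,
       c.value  AS costvalue,
       cm.value AS commentvalue
FROM MyTable AS t
     CROSS APPLY STRING_SPLIT(t.Data, '|', 1) AS d
     CROSS APPLY STRING_SPLIT(t.Cost, '|', 1) AS c
     CROSS APPLY STRING_SPLIT(t.Comments, '|', 1) AS cm
WHERE c.ordinal = d.ordinal      -- keep only items at the same position
  AND cm.ordinal = d.ordinal
ORDER BY t.ID, d.ordinal;
```

Empty items (such as the blank comment in row 1) come back as zero-length strings, so the positions stay aligned.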
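However, there is only one real answer to this problem: don't store delimited values in SQL Server. Store them in a normalized form and you won't have this problem.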
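As a rough sketch of what a normalized layout could look like, each delimited item becomes its own row keyed by the original ID and its position. The table name, column names, and data types here are hypothetical, chosen only for illustration, and blank comments are stored as NULL by assumption.

```sql
-- Hypothetical normalized table: one row per (ID, ItemNumber)
-- instead of one pipe-delimited string per column.
CREATE TABLE dbo.MyTableItem (
    ID           int           NOT NULL,
    ItemNumber   int           NOT NULL,  -- position the value had in the delimited string
    DataValue    int           NOT NULL,
    CostValue    decimal(10,2) NOT NULL,
    CommentValue varchar(200)  NULL,      -- NULL stands in for a blank comment
    CONSTRAINT PK_MyTableItem PRIMARY KEY (ID, ItemNumber)
);

INSERT INTO dbo.MyTableItem (ID, ItemNumber, DataValue, CostValue, CommentValue)
VALUES (1, 1, 1,   0.00, 'test test'),
       (1, 2, 2,   3.17, NULL),
       (1, 3, 3,  42.42, 'previous thing has a blank comment'),
       (2, 1, 1, 420.69, 'test'),
       (3, 1, 1,   3.50, NULL),
       (3, 2, 2,   4.20, 'test');

-- The desired output is then a plain SELECT, with no splitting required.
SELECT ID, DataValue, CostValue, CommentValue
FROM dbo.MyTableItem
ORDER BY ID, ItemNumber;
```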
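Answer 1 (score: 0)
Here is a solution that uses a recursive CTE instead of a user-defined function (UDF), which is useful for those who do not have permission to create functions.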
CREATE TABLE mytable(
ID INTEGER NOT NULL PRIMARY KEY
,Data VARCHAR(7) NOT NULL
,Cost VARCHAR(20) NOT NULL
,Comments VARCHAR(47) NOT NULL
);
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (2,'1','$420.69','test');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (3,'1|2','$3.50|$4.20','|test');
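This query lets you choose the delimiter through a variable. It then uses a common table expression to parse each delimited string, producing one row for each part of those strings and preserving the ordinal position of each part.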
declare @delimiter as varchar(1)
set @delimiter = '|'
;with cte as (
    -- anchor: one row per source row, with a trailing delimiter appended to each string
    select id
        , 0 as pos
        , convert(varchar(max), null) as datavalue
        , convert(varchar(max), null) as costvalue
        , convert(varchar(max), null) as commentvalue
        , convert(varchar(max), data + @delimiter) as data
        , convert(varchar(max), cost + @delimiter) as cost
        , convert(varchar(max), comments + @delimiter) as comments
    from mytable as t
    union all
    -- recursive step: peel the first part off each string and keep the remainder
    select id
        , pos + 1
        , convert(varchar(max), left(data, charindex(@delimiter, data) - 1))
        , convert(varchar(max), left(cost, charindex(@delimiter, cost) - 1))
        , convert(varchar(max), left(comments, charindex(@delimiter, comments) - 1))
        , convert(varchar(max), stuff(data, 1, charindex(@delimiter, data), ''))
        , convert(varchar(max), stuff(cost, 1, charindex(@delimiter, cost), ''))
        , convert(varchar(max), stuff(comments, 1, charindex(@delimiter, comments), ''))
    from cte
    where (data like ('%' + @delimiter + '%') and cost like ('%' + @delimiter + '%')) or comments like ('%' + @delimiter + '%')
)
select id, datavalue, costvalue, commentvalue
from cte
where datavalue IS NOT NULL -- skip the anchor rows, which carry no extracted values
order by id, pos -- order by position rather than by the string value of datavalue
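Each time the recursion adds a new row, it uses left() to put the first part of the delimited string into the desired output column, then uses stuff() to remove that part and the delimiter that follows it from the source string, so that the next row starts at the next part. Note that, to get the extraction started, a delimiter is appended to the end of each source string; this ensures the where clause does not exclude any wanted value.
Result: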
id  datavalue  costvalue  commentvalue
--  ---------  ---------  ----------------------------------
1   1          $0.00      test test
1   2          $3.17
1   3          $42.42     previous thing has a blank comment
2   1          $420.69    test
3   1          $3.50
3   2          $4.20      test
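To make the left()/stuff() step described above concrete, here is a single iteration run by hand on one string. The variable names are only for illustration; the expressions are the same ones the recursive member of the CTE uses.

```sql
DECLARE @delimiter varchar(1) = '|';
-- Same preparation as the anchor row: append a trailing delimiter.
DECLARE @s varchar(max) = '1|2|3' + @delimiter;   -- '1|2|3|'

SELECT LEFT(@s, CHARINDEX(@delimiter, @s) - 1)        AS first_part,  -- '1'
       STUFF(@s, 1, CHARINDEX(@delimiter, @s), '')    AS remainder;   -- '2|3|'
```

Applying the same two expressions to the remainder yields '2' and '3|', then '3' and an empty string, at which point no delimiter is left and the recursion stops.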