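I know the question of "combining multiple rows into a list" has been answered a million times, and here is a reference to a great article: Concatenating row values in transact sql
I need to combine multiple rows into lists for several columns at the same time: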
ID | Col1 | Col2          ID | Col1 | Col2
------------------   =>   ------------------
1  | A    | X             1  | A    | X
2  | B    | Y             2  | B,C  | Y,Z
2  | C    | Z
I tried the XML method, but it turned out to be very slow on large tables:
SELECT DISTINCT
[ID],
[Col1] = STUFF((SELECT ',' + t2.[Col1]
FROM #Table t2
WHERE t2.ID = t.ID
FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),1,1,''),
[Col2] = STUFF((SELECT ',' + t2.[Col2]
FROM #Table t2
WHERE t2.ID = t.ID
FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),1,1,'')
FROM #Table t
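My current solution is a stored procedure that builds the row for each ID separately. I would like to know whether there is another approach I could use (other than a loop).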
For each column, rank the rows to combine (partition by the key column)
End up with a table like
ID | Col1 | Col2 | Col1Rank | Col2Rank
1  | A    | X    | 1        | 1
2  | B    | Y    | 1        | 1
2  | C    | Z    | 2        | 2
Create a new table containing top rank columns for each ID
ID | Col1Comb | Col2Comb
1  | A        | X
2  | B        | Y
Loop through each remaining rank in increasing order (in this case 1 iteration)
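-- Rank 1 is already in #newtable, so start appending from rank 2.
-- Assumes Col1Rank and Col2Rank agree on every row, as in the example above.
DECLARE @irank INT = 2;
DECLARE @maxrank INT = (SELECT MAX(Col1Rank) FROM #oldtable);
WHILE @irank <= @maxrank
BEGIN
    UPDATE n SET
        n.Col1Comb = n.Col1Comb + ISNULL(',' + o.Col1, ''), -- append the rank @irank items
        n.Col2Comb = n.Col2Comb + ISNULL(',' + o.Col2, '')  -- if they are not null
    FROM #newtable n
    JOIN #oldtable o
        ON o.ID = n.ID
    WHERE o.Col1Rank = @irank;
    SET @irank += 1;
END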
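The first two steps described above (ranking the rows, then seeding a new table with the rank 1 row per ID) could be sketched roughly as follows. This is only an illustration: the sample rows, the column types, and the choice of ordering by Col1, Col2 for both ranks are assumptions, and NULL handling is left out.

```sql
-- Hypothetical sample data mirroring the example in the question.
CREATE TABLE #Table (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(10));
INSERT #Table VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (2, 'C', 'Z');

-- Step 1: rank the rows within each ID (same ordering for both ranks).
SELECT ID, Col1, Col2,
       Col1Rank = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Col1, Col2),
       Col2Rank = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Col1, Col2)
INTO #oldtable
FROM #Table;

-- Step 2: seed the result table with the rank 1 row for each ID.
-- VARCHAR(MAX) leaves room for the values appended by the later loop.
SELECT ID,
       Col1Comb = CAST(Col1 AS VARCHAR(MAX)),
       Col2Comb = CAST(Col2 AS VARCHAR(MAX))
INTO #newtable
FROM #oldtable
WHERE Col1Rank = 1;
```

The number of loop iterations that remain is then the largest rank minus one, which is why this approach still scales with the size of the biggest group.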
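Answer 0 (score: 3)
You can use a CTE trick in which the UPDATE is applied through the CTE.
Method 1: a new parallel table into which the data is copied and concatenated: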
CREATE TABLE #Table1(ID INT, Col1 VARCHAR(1), Col2 VARCHAR(1), RowID INT IDENTITY(1,1));
CREATE TABLE #Table1Concat(ID INT, Col3 VARCHAR(MAX), Col4 VARCHAR(MAX), RowID INT);
GO
INSERT #Table1 VALUES(1,'A','X'), (2,'B','Y'), (2,'C','Z');
GO
INSERT #Table1Concat
SELECT * FROM #Table1;
GO
DECLARE @Cat1 VARCHAR(MAX) = '';
DECLARE @Cat2 VARCHAR(MAX) = '';
; WITH CTE AS (
SELECT TOP 2147483647 t1.*, t2.Col3, t2.Col4, r = ROW_NUMBER()OVER(PARTITION BY t1.ID ORDER BY t1.Col1, t1.Col2)
FROM #Table1 t1
JOIN #Table1Concat t2 ON t1.RowID = t2.RowID
ORDER BY t1.ID, t1.Col1, t1.Col2
)
UPDATE CTE
SET @Cat1 = Col3 = CASE r WHEN 1 THEN ISNULL(Col1,'') ELSE @Cat1 + ',' + Col1 END
, @Cat2 = Col4 = CASE r WHEN 1 THEN ISNULL(Col2,'') ELSE @Cat2 + ',' + Col2 END;
GO
SELECT ID, Col3 = MAX(Col3)
, Col4 = MAX(Col4)
FROM #Table1Concat
GROUP BY ID
Method 2: add the concatenation columns directly to the original table and concatenate into the new columns:
CREATE TABLE #Table1(ID INT, Col1 VARCHAR(1), Col2 VARCHAR(1), Col1Cat VARCHAR(MAX), Col2Cat VARCHAR(MAX));
GO
INSERT #Table1(ID,Col1,Col2) VALUES(1,'A','X'), (2,'B','Y'), (2,'C','Z');
GO
DECLARE @Cat1 VARCHAR(MAX) = '';
DECLARE @Cat2 VARCHAR(MAX) = '';
; WITH CTE AS (
SELECT TOP 2147483647 t1.*, r = ROW_NUMBER()OVER(PARTITION BY t1.ID ORDER BY t1.Col1, t1.Col2)
FROM #Table1 t1
ORDER BY t1.ID, t1.Col1, t1.Col2
)
UPDATE CTE
SET @Cat1 = Col1Cat = CASE r WHEN 1 THEN ISNULL(Col1,'') ELSE @Cat1 + ',' + Col1 END
, @Cat2 = Col2Cat = CASE r WHEN 1 THEN ISNULL(Col2,'') ELSE @Cat2 + ',' + Col2 END;
GO
SELECT ID, Col1Cat = MAX(Col1Cat)
, Col2Cat = MAX(Col2Cat)
FROM #Table1
GROUP BY ID;
GO
Answer 1 (score: 1)
Try this:
Query 1:
DECLARE @temp TABLE
(
ID INT
, Col1 VARCHAR(30)
, Col2 VARCHAR(30)
)
INSERT INTO @temp (ID, Col1, Col2)
VALUES
(1, 'A', 'X'),
(2, 'B', 'Y'),
(2, 'C', 'Z')
SELECT
r.ID
, Col1 = STUFF(REPLACE(REPLACE(CAST(d.x.query('/t1/a') AS VARCHAR(MAX)), '<a>', ','), '</a>', ''), 1, 1, '')
, Col2 = STUFF(REPLACE(REPLACE(CAST(d.x.query('/t2/a') AS VARCHAR(MAX)), '<a>', ','), '</a>', ''), 1, 1, '')
FROM (
SELECT DISTINCT ID
FROM @temp
) r
OUTER APPLY (
SELECT x = CAST((
SELECT
[t1/a] = t2.Col1
, [t2/a] = t2.Col2
FROM @temp t2
WHERE r.ID = t2.ID
FOR XML PATH('')
) AS XML)
) d
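To see what the OUTER APPLY hands to the REPLACE/STUFF step, it can help to select the intermediate XML on its own. The sketch below reuses the same sample data and the same inner query as Query 1; it only swaps the final column expressions for the raw d.x value, so nothing here goes beyond what the query above already does.

```sql
-- Same sample data as Query 1, redeclared so the snippet runs on its own.
DECLARE @temp TABLE (ID INT, Col1 VARCHAR(30), Col2 VARCHAR(30));
INSERT INTO @temp (ID, Col1, Col2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (2, 'C', 'Z');

-- Return the XML fragment built for each ID, before any string cleanup.
SELECT r.ID, d.x
FROM (
    SELECT DISTINCT ID
    FROM @temp
) r
OUTER APPLY (
    SELECT x = CAST((
        SELECT
              [t1/a] = t2.Col1
            , [t2/a] = t2.Col2
        FROM @temp t2
        WHERE r.ID = t2.ID
        FOR XML PATH('')
    ) AS XML)
) d;
```

The point of the design is that the base table is read once per ID, and each output column is then extracted from that single XML value with its own path expression.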
Query 2:
SELECT
r.ID
, Col1 = STUFF(REPLACE(CAST(d.x.query('for $a in /a return xs:string($a)') AS VARCHAR(MAX)), ' ,', ','), 1, 1, '')
, Col2 = STUFF(REPLACE(CAST(d.x.query('for $b in /b return xs:string($b)') AS VARCHAR(MAX)), ' ,', ','), 1, 1, '')
FROM (
SELECT DISTINCT ID
FROM @temp
) r
OUTER APPLY (
SELECT x = CAST((
SELECT
[a] = ',' + t2.Col1
, [b] = ',' + t2.Col2
FROM @temp t2
WHERE r.ID = t2.ID
FOR XML PATH('')
) AS XML)
) d
Output:
ID Col1 Col2
----------- ---------- ----------
1 A X
2 B,C Y,Z
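Answer 2 (score: 0)
One solution, at least a syntactically straightforward one, is to use a user-defined aggregate to "join" the values together. This does require SQLCLR, and while some people are reluctant to enable it, it provides a set-based approach that does not need to re-query the base table for each column. Joining is the opposite of splitting: it creates a single comma-separated list from the individual rows.
Below is a simple example using the SQL# (SQLsharp) library, which comes with a user-defined aggregate named Agg_Join() that does exactly what is asked for here. You can download the free version of SQL# from http://www.SQLsharp.com/; the example selects from a standard system view. (In fairness, I am the author of SQL#, but this function is available for free.)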
SELECT sc.[object_id],
OBJECT_NAME(sc.[object_id]) AS [ObjectName],
SQL#.Agg_Join(sc.name) AS [ColumnNames],
SQL#.Agg_Join(DISTINCT sc.system_type_id) AS [DataTypes]
FROM sys.columns sc
GROUP BY sc.[object_id]
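I recommend testing this against your current solution to determine which is fastest for the volume of data you expect to have over at least the next year or two.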
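For comparison, a built-in aggregate in the same spirit exists in newer versions: STRING_AGG, available in SQL Server 2017 and later. This is a different technique from Agg_Join(), shown only as a minimal sketch under the assumption that such a version is in use; the table variable @t and its rows are hypothetical and simply mirror the example in the question.

```sql
-- Assumes SQL Server 2017 or later, where STRING_AGG is built in.
-- @t is a hypothetical stand-in for the real table.
DECLARE @t TABLE (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(10));
INSERT INTO @t (ID, Col1, Col2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (2, 'C', 'Z');

-- One pass over the table; both lists use the same ordering so that
-- the n-th item of Col1 lines up with the n-th item of Col2.
SELECT ID,
       Col1 = STRING_AGG(CAST(Col1 AS VARCHAR(MAX)), ',')
                  WITHIN GROUP (ORDER BY Col1, Col2),
       Col2 = STRING_AGG(CAST(Col2 AS VARCHAR(MAX)), ',')
                  WITHIN GROUP (ORDER BY Col1, Col2)
FROM @t
GROUP BY ID;
```

The casts to VARCHAR(MAX) are there because the result type of STRING_AGG follows its input type, and a non-MAX input limits the concatenated result to 8000 bytes. As with the other options, it would be worth timing this against the current solution on realistic data.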