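I am trying to aggregate data from one table into another. I inherited this project; I did not design this database, and I am not able to change its format.
The [RawData] table will have 1 record per account, per ChannelCodeID. This table (where I currently have the data) contains the following fields: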
[Account] int
[ChannelCodeID] int
[ChannelCode] varchar(10)
The [AggregatedData] table will have 1 record per account. This table (where I need to insert the data) contains the following fields:
[Account] int
[Count] int
[Channel1] int
[Channel2] int
[Channel3] int
[Names] varchar(250)
For example, I might have the following records in my [RawData] table:
Account ChannelCodeID ChannelCode
12345 2 ABC
12345 4 DEF
12345 6 GHI
54321 2 ABC
54321 6 GHI
99999 2 ABC
And, after aggregating them, I need to produce the following records in my [AggregatedData] table:
Account Count Channel1 Channel2 Channel3 Names
12345 3 2 4 6 ABC.DEF.GHI
54321 2 2 6 0 ABC.GHI
99999 1 2 0 0 ABC
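As you can see, Count is how many records exist in my [RawData] table for the account, Channel1 is the first ChannelCodeID, Channel2 is the second, and Channel3 is the third. If there are not enough ChannelCodeIDs in my [RawData] table, the extra Channel columns get a value of '0'. Additionally, I need to concatenate the 'ChannelCode' column and store it in the 'Names' column of the [AggregatedData] table, but (obviously) if there is only one record, I don't want to add the '.'.
I can't figure out how to do this without using a cursor and a bunch of variables, but I'm guessing there is a better way. This doesn't have to be super fast, as it will only run once a month, but it will have to process at least 10,000-15,000 records each time.
Thanks in advance...
EDIT
ChannelCodes and ChannelCodeIDs map directly to each other and are always the same. For example, ChannelCodeID 2 is always 'ABC'.
Also, in the [AggregatedData] table, Channel1 is always the lowest value, although this is incidental.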
Answer 0: (score: 3)
DECLARE @TABLE TABLE (Account INT, ChannelCodeID INT, ChannelCode VARCHAR(10))
INSERT INTO @TABLE VALUES
(12345 ,2 ,'ABC'),
(12345 ,4 ,'DEF'),
(12345 ,6 ,'GHI'),
(54321 ,2 ,'ABC'),
(54321 ,6 ,'GHI'),
(99999 ,2 ,'ABC')
SELECT Account
,[Count]
,ISNULL([Channel1], 0) AS [Channel1]
,ISNULL([Channel2], 0) AS [Channel2]
,ISNULL([Channel3], 0) AS [Channel3]
,Names
FROM
(
SELECT t.Account, T.ChannelCodeID, C.[Count]
,'Channel' + CAST(ROW_NUMBER() OVER
(PARTITION BY t.Account ORDER BY t.ChannelCodeID ASC) AS VARCHAR(10))Channels
,STUFF((SELECT '.' + ChannelCode
FROM @TABLE
WHERE Account = t.Account
ORDER BY ChannelCodeID -- make the concatenation order deterministic
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'') AS Names
FROM @TABLE t INNER JOIN (SELECT Account , COUNT(*) AS [Count]
FROM @TABLE
GROUP BY Account) c
ON T.Account = C.Account
)A
PIVOT (MAX(ChannelCodeID)
FOR Channels
IN ([Channel1],[Channel2],[Channel3])
) p
╔═════════╦═══════╦══════════╦══════════╦══════════╦═════════════╗
║ Account ║ Count ║ Channel1 ║ Channel2 ║ Channel3 ║ Names ║
╠═════════╬═══════╬══════════╬══════════╬══════════╬═════════════╣
║ 12345 ║ 3 ║ 2 ║ 4 ║ 6 ║ ABC.DEF.GHI ║
║ 54321 ║ 2 ║ 2 ║ 6 ║ 0 ║ ABC.GHI ║
║ 99999 ║ 1 ║ 2 ║ 0 ║ 0 ║ ABC ║
╚═════════╩═══════╩══════════╩══════════╩══════════╩═════════════╝
Answer 1: (score: 1)
-- Back up the raw data into a temp table
SELECT * INTO #rawData FROM RawData
-- First, populate the lowest channel and the base record
INSERT INTO AggregatedData (Account,[Count],Channel1,Channel2,Channel3)
SELECT Account,1,MIN(ChannelCodeID),0,0
FROM #rawData
GROUP BY Account
-- Gives you something like this
-- Account Count Channel1 Channel2 Channel3 Names
-- 12345 1 2 0 0 NULL
-- 54321 1 2 0 0 NULL
-- 99999 1 2 0 0 NULL
--
-- Remove the rows that have already been placed in a channel column
DELETE r
FROM #rawData r
INNER JOIN AggregatedData a
ON a.Account = r.Account
AND r.ChannelCodeID IN (a.Channel1, a.Channel2, a.Channel3)
-- Now do the update
UPDATE AggregatedData SET Channel2 = xx.NextLowest, [Count] = [Count] + 1
FROM
( SELECT Account, MIN(ChannelCodeID) AS NextLowest
FROM #rawData
GROUP BY Account ) xx
WHERE AggregatedData.Account = xx.Account
-- Repeat the above for Channel3
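The repeated pass for Channel3 might look roughly like the sketch below. It assumes the #rawData temp table and the [AggregatedData] columns described in the question, and it is only one way to write the step, not necessarily what the answerer had in mind.

```sql
-- Sketch of the Channel3 pass (assumes #rawData and AggregatedData as described above).
-- Remove every row whose channel has already been placed in a channel column.
DELETE r
FROM #rawData r
INNER JOIN AggregatedData a
        ON a.Account = r.Account
       AND r.ChannelCodeID IN (a.Channel1, a.Channel2, a.Channel3);

-- The lowest remaining ChannelCodeID per account becomes Channel3.
UPDATE a
SET Channel3 = xx.NextLowest,
    [Count]  = a.[Count] + 1
FROM AggregatedData a
INNER JOIN (SELECT Account, MIN(ChannelCodeID) AS NextLowest
            FROM #rawData
            GROUP BY Account) xx
        ON xx.Account = a.Account;
```

Accounts with no rows left in #rawData are simply not matched by the join, so their Channel3 stays 0. Note that [Count] built this way never exceeds 3, even if an account has more channels.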
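You will then need an UPDATE statement on the final aggregate table, based on the channel IDs. Since this does not run often, I would suggest a UDF that takes 3 parameters and returns a string, something like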
UPDATE AggregatedData SET [names] = dbo.BuildNameList(channel1,channel2,channel3)
It runs a bit slowly, but it's not too bad overall.
Hope this gives you some ideas.
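The answer names dbo.BuildNameList but does not define it, so the body below is a hypothetical sketch. It assumes that RawData lives in the dbo schema and can double as the ChannelCodeID-to-ChannelCode lookup (the question states that the mapping is fixed), and that 0 is never a real ChannelCodeID.

```sql
-- Hypothetical body for dbo.BuildNameList; the original answer only names the function.
-- Assumption: dbo.RawData can serve as the ChannelCodeID -> ChannelCode lookup.
CREATE FUNCTION dbo.BuildNameList (@Channel1 INT, @Channel2 INT, @Channel3 INT)
RETURNS VARCHAR(250)
AS
BEGIN
    DECLARE @Names VARCHAR(250);

    -- DISTINCT collapses the many RawData rows that share a ChannelCodeID.
    -- Unused channels are passed as 0 and are assumed to match nothing.
    SELECT @Names = STUFF((
        SELECT '.' + c.ChannelCode
        FROM (SELECT DISTINCT ChannelCodeID, ChannelCode
              FROM dbo.RawData
              WHERE ChannelCodeID IN (@Channel1, @Channel2, @Channel3)) c
        ORDER BY c.ChannelCodeID
        FOR XML PATH('')), 1, 1, '');  -- STUFF strips the leading '.'

    RETURN @Names;
END
```

Because STUFF removes only the leading separator, an account with a single channel gets a plain code with no '.'. If none of the three IDs is found, the function returns NULL.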
Answer 2: (score: 0)
WITH CTE AS (
    SELECT Account, ChannelCodeID, ChannelCode,
           RANK() OVER (PARTITION BY Account ORDER BY ChannelCodeID) [ChRank]
    FROM RawData
)
SELECT A.Account, COUNT(A.Account) [Count],
    ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=1),0) [Channel1],
    ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=2),0) [Channel2],
    ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=3),0) [Channel3],
    STUFF((SELECT '.'+ChannelCode FROM CTE WHERE A.Account=CTE.Account ORDER BY ChannelCodeID FOR XML PATH('')),1,1,'') [Names]
FROM RawData A
GROUP BY A.Account
This uses a Common Table Expression to rank the channels within each account, and then groups and displays the data.
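The query above only selects the rows. As a hedged variation that also writes them to [AggregatedData], the sketch below ranks the channels once with ROW_NUMBER() and folds them into columns with conditional aggregation, a swapped-in technique rather than the answerer's own. The CTE name Ranked and the column alias rn are illustrative, and the table and column names are taken from the question.

```sql
-- Sketch only: rank each account's channels, then pivot them with MAX(CASE ...).
WITH Ranked AS (
    SELECT Account, ChannelCodeID, ChannelCode,
           ROW_NUMBER() OVER (PARTITION BY Account ORDER BY ChannelCodeID) AS rn
    FROM RawData
)
INSERT INTO AggregatedData (Account, [Count], Channel1, Channel2, Channel3, Names)
SELECT r.Account,
       COUNT(*),
       -- A missing rank yields NULL from MAX, which ISNULL turns into 0.
       ISNULL(MAX(CASE WHEN r.rn = 1 THEN r.ChannelCodeID END), 0),
       ISNULL(MAX(CASE WHEN r.rn = 2 THEN r.ChannelCodeID END), 0),
       ISNULL(MAX(CASE WHEN r.rn = 3 THEN r.ChannelCodeID END), 0),
       -- Concatenate the codes in rank order; STUFF strips the leading '.'.
       STUFF((SELECT '.' + x.ChannelCode
              FROM Ranked x
              WHERE x.Account = r.Account
              ORDER BY x.rn
              FOR XML PATH('')), 1, 1, '')
FROM Ranked r
GROUP BY r.Account;
```

If an account has more than three channels, Count and Names still cover all of them while only the three lowest IDs land in the channel columns. The question does not say how that case should be handled, so treat it as an open point.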