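I have a set of records (table [#tmp_origin]) whose string column [Names] contains duplicate entries. I want to insert the entire content of [#tmp_origin] into a destination table [#tmp_destination], which does not allow duplicates and may already contain items.
If a string from the origin table does not exist in the destination table, it is simply inserted as-is. If an entry with the same value already exists in the destination table, an incremental number must be appended to the string before it is inserted.
In this sample script, the process of moving the data this way is implemented with a cursor: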
-- create initial situation (origin and destination table, both containing items) - Begin
CREATE TABLE [#tmp_origin] ([Names] VARCHAR(10))
CREATE TABLE [#tmp_destination] ([Names] VARCHAR(10))
CREATE UNIQUE INDEX [IX_UniqueName] ON [#tmp_destination]([Names] ASC)

INSERT INTO [#tmp_origin]([Names]) VALUES ('a')
INSERT INTO [#tmp_origin]([Names]) VALUES ('a')
INSERT INTO [#tmp_origin]([Names]) VALUES ('b')
INSERT INTO [#tmp_origin]([Names]) VALUES ('c')

INSERT INTO [#tmp_destination]([Names]) VALUES ('a')
INSERT INTO [#tmp_destination]([Names]) VALUES ('a_1')
INSERT INTO [#tmp_destination]([Names]) VALUES ('b')
-- create initial situation - End

DECLARE @Name VARCHAR(10)

DECLARE NamesCursor CURSOR LOCAL FORWARD_ONLY FAST_FORWARD READ_ONLY FOR
    SELECT [Names] FROM [#tmp_origin];

OPEN NamesCursor;
FETCH NEXT FROM NamesCursor INTO @Name;

WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @finalName VARCHAR(10)
    SET @finalName = @Name

    DECLARE @counter INT
    SET @counter = 1

    -- try @Name, @Name_1, @Name_2, ... until a free value is found
    WHILE(1=1)
    BEGIN
        IF NOT EXISTS(SELECT * FROM [#tmp_destination] WHERE [Names] = @finalName)
            BREAK;

        SET @finalName = @Name + '_' + CAST(@counter AS VARCHAR)
        SET @counter = @counter + 1
    END

    INSERT INTO [#tmp_destination] ([Names]) ( SELECT @finalName )

    FETCH NEXT FROM NamesCursor INTO @Name;
END

CLOSE NamesCursor;
DEALLOCATE NamesCursor;

SELECT * FROM [#tmp_destination]

/* Expected result:
a
a_1
a_2
a_3
b
b_1
c
*/

DROP TABLE [#tmp_origin]
DROP TABLE [#tmp_destination]
This works correctly, but performance drops sharply as the number of items to insert grows.
Any ideas on how to speed it up?
Thanks
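Answer 0 (score: 5)
Window functions can number the duplicates. You can also get a count from the destination table (this needs a condition that strips the suffixes you have added):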
select orig.names,
       row_number() over (partition by orig.names order by orig.names) as rowNo,
       dest.[count]
from #tmp_origin orig
cross apply (select count(1) as [count] from #tmp_destination where names = orig.names) as dest
The insert can be built from the query above (if it is greater than zero, the new suffix is rowNo + dest.count - 1).
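One way the pieces above could be combined into a single statement is sketched below. This is not the answer author's exact query: the CTE names `numbered` and `suffixed`, the `[cnt]` alias and the `LIKE` pattern used in place of suffix stripping are assumptions made for illustration. It also assumes that the destination holds suffixes without gaps, starting from the bare name (a, a_1, a_2, ...), and that names contain no `LIKE` wildcard characters (`%`, `_`, `[`).

```sql
-- Sketch only: set-based replacement for the cursor loop.
-- Assumption: destination suffixes are contiguous and start at the bare name.
;WITH numbered AS (
    -- number the duplicates inside the origin table: 1, 2, 3, ... per name
    SELECT o.[Names],
           ROW_NUMBER() OVER (PARTITION BY o.[Names] ORDER BY o.[Names]) AS [rowNo]
    FROM [#tmp_origin] AS o
),
suffixed AS (
    SELECT n.[Names],
           n.[rowNo] + dest.[cnt] - 1 AS [suffix]
    FROM numbered AS n
    CROSS APPLY (
        -- count the bare name plus its already suffixed variants
        SELECT COUNT(*) AS [cnt]
        FROM [#tmp_destination] AS d
        WHERE d.[Names] = n.[Names]
           OR d.[Names] LIKE n.[Names] + '[_][0-9]%'
    ) AS dest
)
INSERT INTO [#tmp_destination] ([Names])
SELECT CASE WHEN [suffix] = 0 THEN [Names]
            ELSE [Names] + '_' + CAST([suffix] AS VARCHAR(10))
       END
FROM suffixed;
```

On the sample data from the question this adds a_2, a_3, b_1 and c, which matches the expected result of the cursor version. Unlike the cursor, it does not fill gaps in existing suffixes, so it is only equivalent under the assumptions stated above.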
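I suggest restructuring the destination temp table so that the name and the suffix are held in separate columns (this may mean adding a new intermediate stage), because that makes the matching logic much simpler.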
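A rough sketch of what that restructuring might look like follows. The table `[#tmp_destination_split]` and its columns `[BaseName]`, `[Suffix]` and `[FullName]` are hypothetical names introduced here, not part of the original post, and the multi-row `VALUES` clause assumes SQL Server 2008 or later.

```sql
-- Hypothetical replacement for #tmp_destination: name and suffix kept apart.
CREATE TABLE [#tmp_destination_split] (
    [BaseName] VARCHAR(10) NOT NULL,
    [Suffix]   INT         NOT NULL,   -- 0 means "no suffix"
    -- display form rebuilt on demand: 'a', 'a_1', 'a_2', ...
    [FullName] AS ([BaseName] + CASE WHEN [Suffix] = 0 THEN ''
                                     ELSE '_' + CAST([Suffix] AS VARCHAR(10)) END),
    UNIQUE ([BaseName], [Suffix])
);

-- same starting rows as in the question: a, a_1, b
INSERT INTO [#tmp_destination_split] ([BaseName], [Suffix])
VALUES ('a', 0), ('a', 1), ('b', 0);

-- continue numbering after the highest suffix already stored per name
INSERT INTO [#tmp_destination_split] ([BaseName], [Suffix])
SELECT o.[Names],
       ROW_NUMBER() OVER (PARTITION BY o.[Names] ORDER BY o.[Names])
         + ISNULL(m.[MaxSuffix], -1)
FROM [#tmp_origin] AS o
LEFT JOIN (
    SELECT [BaseName], MAX([Suffix]) AS [MaxSuffix]
    FROM [#tmp_destination_split]
    GROUP BY [BaseName]
) AS m ON m.[BaseName] = o.[Names];

SELECT [FullName]
FROM [#tmp_destination_split]
ORDER BY [BaseName], [Suffix];
```

With the suffix stored as an integer, matching becomes a plain equality join on `[BaseName]` and no string parsing is needed. Note that this variant always numbers past the current maximum, so it would not reuse a gap the way the cursor loop does.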
Answer 1 (score: 1)
Something like this:
insert [#tmp_destination]
select CASE
         WHEN row_number() over(partition by Names order by Names) > 1
           THEN Names + '_' + CONVERT(VARCHAR(10), row_number() over(partition by Names order by Names))
         ELSE Names
       END
from [#tmp_origin]
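Answer 2 (score: 1)
I would not use a cursor in this case. Instead, I would build the query with ROW_NUMBER(). That lets you add a counter to the origin table and then append this counter to [Names]: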
SELECT [Names], ROW_NUMBER() OVER (PARTITION BY [Names] ORDER BY [Names]) - 1 AS [counter]
INTO #tmp_origin_with_counter
FROM #tmp_origin

-- #tmp_destination already exists, so use INSERT ... SELECT rather than SELECT ... INTO
INSERT INTO #tmp_destination ([Names])
SELECT CONCAT([Names], IIF([counter] = 0, '', '_' + CAST([counter] AS VARCHAR(10))))
FROM #tmp_origin_with_counter