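I have a table 'Objects' in a SQL Server DB. It contains the names (strings) of objects. I have a list of names of new objects, held in a separate table 'NewObjects', that need to be inserted into the 'Objects' table. This operation is called an "import" from here on.
For each record to be imported from 'NewObjects' into 'Objects', I need to generate a unique name if the record's name already exists in 'Objects'. This new name is stored in the 'NewObjects' table alongside the old name.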
DECLARE @NewObjects TABLE
(
...
Name varchar(20),
newName nvarchar(20)
)
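I have implemented a stored procedure that generates a unique name for each record to be imported from 'NewObjects'. However, I am not happy with the performance for 1000 records (in 'NewObjects'). I would like help optimizing my code. Here is the implementation: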
PROCEDURE [dbo].[importWithNewNames] @args varchar(MAX)
-- Sample of @args is like 'A,B,C,D' (a CSV string)
...
DECLARE @NewObjects TABLE
(
_index int identity PRIMARY KEY,
Name varchar(20),
newName nvarchar(20)
)
-- 'SplitString' function: a working implementation whose performance is not a concern right now
INSERT INTO @NewObjects (Name)
SELECT * from SplitString(@args, ',')
declare @beg int = 1
declare @end int
DECLARE @oldName varchar(20) -- same width as the Name column, so longer names are not truncated
-- get the count of the rows
select @end = MAX(_index) from @NewObjects
while @beg <= @end
BEGIN
select @oldName = Name from @NewObjects where @beg = _index
Declare @nameExists int = 0
-- this is our constant. We cannot change
DECLARE @MAX_NAME_WIDTH int = 5
DECLARE @counter int = 1
DECLARE @newName varchar(10)
DECLARE @z varchar(10)
select @nameExists = count(name) from Objects where name = @oldName
...
IF @nameExists > 0
BEGIN
-- create name based on pattern 'Fxxxxx'. Example: 'F00001', 'F00002'.
select @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
while EXISTS (select top 1 1 from Objects where name = @newName)
OR EXISTS (select top 1 1 from @NewObjects where newName = @newName)
BEGIN
select @counter = @counter + 1
select @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
END
select top 1 @z = @newName from Objects
update @NewObjects
set newName = @z where @beg = _index
END
select @beg = @beg + 1
END
-- finally, show the new names generated
select * from @NewObjects
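Answer 0: (score: 2)
Disclaimer: I could not test these suggestions, so they may contain syntax errors that you will have to work through as you implement them. They are meant both as a guide for fixing this procedure and as a way to sharpen your skills for future projects.
One optimization that stands out at a glance, and that matters more the larger the set you iterate over, is this code here: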
select @nameExists = count(name) from Objects where name = @oldName
...
IF @nameExists > 0
Consider changing it to this:
IF EXISTS (select name from Objects where name = @oldName)
Also, instead of doing this:
-- create name based on pattern 'Fxxxxx'. Example: 'F00001', 'F00002'.
select @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
while EXISTS (select top 1 1 from Objects where name = @newName)
OR EXISTS (select top 1 1 from @NewObjects where newName = @newName)
BEGIN
select @counter = @counter + 1
select @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
END
Consider this:
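DECLARE @maxName VARCHAR(20)
SET @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
-- highest generated name already in Objects; the LIKE filter skips names that are not 'F' plus digits
SELECT @maxName = MAX(name) FROM Objects
WHERE name >= @newName AND name LIKE 'F' + REPLICATE('[0-9]', @MAX_NAME_WIDTH)
IF (@maxName IS NOT NULL)
BEGIN
-- continue one past the highest number already taken
SET @counter = CAST(SUBSTRING(@maxName, 2, @MAX_NAME_WIDTH) AS INT) + 1
SET @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
END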
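This ensures you do not loop and run multiple queries just to find the largest integer value among the generated names.
Also, based on the little context I have, you should be able to make one more optimization so that you only ever have to run the above once.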
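DECLARE @maxName VARCHAR(20)
SET @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
IF (@beg = 1)
BEGIN
-- only on the first iteration: find the highest generated name already in Objects
SELECT @maxName = MAX(name) FROM Objects
WHERE name >= @newName AND name LIKE 'F' + REPLICATE('[0-9]', @MAX_NAME_WIDTH)
IF (@maxName IS NOT NULL)
BEGIN
-- continue one past the highest number already taken
SET @counter = CAST(SUBSTRING(@maxName, 2, @MAX_NAME_WIDTH) AS INT) + 1
SET @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
END
END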
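The reason I say you can make that optimization is that, unless you have to worry about other entities inserting records that look like yours (e.g. Fxxxxx) in the meantime, you only have to find the MAX once and can then simply increment @counter across the loop.
In fact, you could pull that entire piece out of the loop. You should be able to extrapolate that pretty easily: just pull the DECLARE and SET of @counter out, along with the code inside the IF (@beg = 1). But take it one step at a time.
Also, change this line:
select top 1 @z = @newName from Objects
to this:
SET @z = @newName
because you are literally running a query against the table just to copy one local variable into another. This could well be a big contributor to the performance problem. A good practice to get into: unless you are actually setting a variable from a SELECT statement, use the SET operation for local variables. There are other places in your code where this applies. Consider this line:
select @beg = @beg + 1
Use this instead:
SET @beg = @beg + 1
Finally, as mentioned above about simply incrementing @counter, at the end of your loop where you have this line:
select @beg = @beg + 1
just add a line:
SET @counter = @counter + 1
and you're golden!
So to recap: you can gather the maximum conflicting name once, which gets rid of all those iterations. You will start using SET to get rid of costly lines like select top 1 @z = @newName from Objects, where you are actually querying a table to set two local variables. And you will leverage EXISTS instead of setting a variable from the aggregate function COUNT to do that work.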
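Putting these suggestions together, the loop might end up looking roughly like the sketch below. It is untested against the real schema and rests on a few assumptions: the @Objects table variable and its sample rows are hypothetical stand-ins for the real Objects table, the rows are inserted directly instead of through SplitString, and no other session inserts Fxxxxx names while it runs. One deliberate variation: @counter is advanced only when a name is actually handed out, rather than at the end of every iteration, so no numbers are skipped.

```sql
-- Hypothetical stand-ins for the real Objects table and the parsed @args list
DECLARE @Objects TABLE (name varchar(20))
INSERT INTO @Objects (name) VALUES ('A'), ('A1'), ('B'), ('F00001')

DECLARE @NewObjects TABLE
(
    _index int identity PRIMARY KEY,
    Name varchar(20),
    newName varchar(20)
)
INSERT INTO @NewObjects (Name) VALUES ('A'), ('B'), ('C'), ('D'), ('F00001')

DECLARE @MAX_NAME_WIDTH int = 5      -- the fixed constant from the question
DECLARE @counter int = 1
DECLARE @maxName varchar(20)
DECLARE @newName varchar(20)
DECLARE @oldName varchar(20)
DECLARE @beg int = 1
DECLARE @end int

SELECT @end = MAX(_index) FROM @NewObjects

-- Find the highest existing 'F' + digits name once, before the loop
SELECT @maxName = MAX(name) FROM @Objects
WHERE name LIKE 'F' + REPLICATE('[0-9]', @MAX_NAME_WIDTH)

IF (@maxName IS NOT NULL)
    SET @counter = CAST(SUBSTRING(@maxName, 2, @MAX_NAME_WIDTH) AS int) + 1

WHILE @beg <= @end
BEGIN
    SELECT @oldName = Name FROM @NewObjects WHERE _index = @beg

    -- EXISTS instead of COUNT: stops at the first match
    IF EXISTS (SELECT 1 FROM @Objects WHERE name = @oldName)
    BEGIN
        SET @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
        UPDATE @NewObjects SET newName = @newName WHERE _index = @beg
        SET @counter = @counter + 1  -- consumed only when a name was handed out
    END

    SET @beg = @beg + 1
END

SELECT * FROM @NewObjects
```

Because every generated number is greater than the highest one already in the table, the loop no longer needs to probe Objects or @NewObjects for collisions on each generated name.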
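Let me know how these optimizations work out.
Answer 1: (score: 1)
You should avoid queries inside loops.. especially when they run against a table variable...
You should try using a temp table and indexing that table on the newName column. I bet it would improve performance a bit..
But you would be better off rewriting it to avoid those loops with queries inside...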
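As a rough illustration of the temp-table idea, the table variable could be swapped for something like the following. This is a sketch only: the index name IX_NewObjects_newName and the sample rows are made up for the example, and whether it actually helps would need to be measured.

```sql
-- Temp table in place of the @NewObjects table variable
CREATE TABLE #NewObjects
(
    _index int identity PRIMARY KEY,
    Name varchar(20),
    newName varchar(20)
);

-- Hypothetical index name; lets the "is this generated name taken?" probe seek on newName
CREATE INDEX IX_NewObjects_newName ON #NewObjects (newName);

-- Sample rows standing in for SplitString(@args, ',')
INSERT INTO #NewObjects (Name) VALUES ('A'), ('B'), ('C');

-- The kind of lookup the original loop runs for every candidate name
SELECT TOP 1 1 FROM #NewObjects WHERE newName = 'F00001';

DROP TABLE #NewObjects;
```

Unlike a table variable, a temp table has column statistics, which gives the optimizer more to work with when choosing a plan for lookups like this one.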
Setting up the environment for testing...
--this would be your object table... I feed it with some values for test
DECLARE @Objects TABLE
(
_index int identity PRIMARY KEY,
Name varchar(20)
)
insert into @Objects(name)
values('A'),('A1'),('B'),('F00001')
--the parameter of your procedure
declare @args varchar(MAX)
set @args = 'A,B,C,D,F00001'
--@NewObjects2 is your @NewObjects, just renamed because I ran your solution alongside mine when testing
DECLARE @NewObjects2 TABLE
(
_index int identity PRIMARY KEY,
Name varchar(20),
newName nvarchar(20)
)
INSERT INTO @NewObjects2 (Name)
SELECT * from SplitString(@args, ',')
declare @end int
select @end = MAX(_index) from @NewObjects2
DECLARE @MAX_NAME_WIDTH int = 5
Up to this point it is very similar to your solution.
Now, here is what I would do instead of your loop:
--generate newNames in the format FXXXXX until there are enough free names to cover every row in @NewObjects2
--you should alter this to find the greatest FXXXXX name in Objects and start generating newNames from that point, to avoid the overhead of creating newNames that will certainly never be used
with N_free as
(
select
0 as [count],
'F' + REPLACE(STR(0, @MAX_NAME_WIDTH, 0), ' ', '0') as [newName],
0 as fl_free,
0 as count_free
union all
select
N.[count] + 1 as [count],
'F' + REPLACE(STR(N.[count]+1, @MAX_NAME_WIDTH, 0), ' ', '0') as [newName],
OA.fl_free,
count_free + OA.fl_free as count_free
from
N_free N
outer apply
(select
case
when not exists(select name from @Objects
where Name = 'F' + REPLACE(STR(N.[count]+1, @MAX_NAME_WIDTH, 0), ' ', '0'))
then 1
else 0
end as fl_free) OA
where
N.count_free < @end
)
--return only those newNames that are free to be used
,newNames as (select ROW_NUMBER() over (order by [count]) as _index_name
,[newName]
from N_free where fl_free = 1
)
--update @NewObjects2, giving a newName to the rows whose name is already used in Objects
update N2
set newName = V2.[newName]
from @NewObjects2 N2
inner join (select V._index,V.Name,newNames.[newName]
from( select row_number() over (partition by case when O.Name is not null
then 1
else 0
end
order by N._index) as _index_name
,N._index
,N.Name
,case when O.Name is not null
then 1
else 0
end as [fl_need_newName]
from @NewObjects2 N
left outer join @Objects O
on O.Name = N.Name
)V
left outer join newNames
on newNames._index_name = V._index_name
and V.fl_need_newName = 1
)V2
on V2._index = N2._index
option(MAXRECURSION 0)
select * from @NewObjects2
With this environment I got the same result as your solution...
You can check whether it really produces the same results...
The result of this query is
_index Name newName
1 A F00002
2 B F00003
3 C NULL
4 D NULL
5 F00001 F00004
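The comment in the query above suggests starting from the greatest FXXXXX name already in Objects. A minimal sketch of that variant, under stated assumptions, follows: it drops the recursive CTE and numbers the conflicting rows with ROW_NUMBER instead. It assumes SQL Server 2012 or later (for TRY_CAST), that nothing else inserts FXXXXX names concurrently, and that the numbers stay within @MAX_NAME_WIDTH digits. Unlike the recursive version it does not reuse gaps below the current maximum, so the two can give different names when such gaps exist. For the sample data it should assign F00002, F00003 and F00004 to the rows A, B and F00001.

```sql
DECLARE @MAX_NAME_WIDTH int = 5;

-- Same test data as above
DECLARE @Objects TABLE (_index int identity PRIMARY KEY, Name varchar(20));
INSERT INTO @Objects (Name) VALUES ('A'), ('A1'), ('B'), ('F00001');

DECLARE @NewObjects2 TABLE
(
    _index int identity PRIMARY KEY,
    Name varchar(20),
    newName varchar(20)
);
INSERT INTO @NewObjects2 (Name) VALUES ('A'), ('B'), ('C'), ('D'), ('F00001');

-- Highest number already used by an 'F' + digits name (0 when there is none)
DECLARE @start int;
SELECT @start = ISNULL(MAX(TRY_CAST(SUBSTRING(Name, 2, @MAX_NAME_WIDTH) AS int)), 0)
FROM @Objects
WHERE Name LIKE 'F' + REPLICATE('[0-9]', @MAX_NAME_WIDTH);

-- Number only the rows whose name already exists in @Objects, in _index order
WITH Conflicts AS
(
    SELECT N._index,
           ROW_NUMBER() OVER (ORDER BY N._index) AS rn
    FROM @NewObjects2 N
    WHERE EXISTS (SELECT 1 FROM @Objects O WHERE O.Name = N.Name)
)
UPDATE N
SET newName = 'F' + RIGHT(REPLICATE('0', @MAX_NAME_WIDTH)
                          + CAST(@start + C.rn AS varchar(10)), @MAX_NAME_WIDTH)
FROM @NewObjects2 N
INNER JOIN Conflicts C ON C._index = N._index;

SELECT * FROM @NewObjects2;
```

This touches @Objects twice in total regardless of how many rows are imported, which is the main point of moving away from the per-row loop.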