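I have a flat file containing denormalized data. For good reasons, I can't change that. I need to load it into normalized, related tables for use in LightSwitch. The data does not contain any of the identity column values from the original tables. I have four columns: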
Division Branch Position Location
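The schema for the normalized data is: Divisions contain Branches. Branches contain Positions. Positions and Locations have a many-to-many relationship through the PositionLocationMappings table.
I run a BULK INSERT to load the denormalized data into a staging table. Then I process it row by row, calling a stored procedure for each row. The source file has about 16,000 rows and the load takes 27 seconds, which seems a bit slow. Is there a way to do this faster?
This is in my PostDeployment.sql script: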
DECLARE @division nvarchar(240)
DECLARE @branch nvarchar(240)
DECLARE @position nvarchar(240)
DECLARE @location nvarchar(60)
DECLARE myCursor CURSOR LOCAL FOR
SELECT DISTINCT Division,Branch,Position,Location
FROM [staging].BranchPositions
OPEN myCursor
FETCH NEXT FROM myCursor INTO @division, @branch, @position, @location
WHILE @@FETCH_STATUS = 0 BEGIN
EXECUTE [dbo].[usp_InsertBranchPositions] @division,@branch,@position,@location
FETCH NEXT FROM myCursor INTO @division, @branch, @position, @location
END
CLOSE myCursor
DEALLOCATE myCursor
Here is the stored procedure:
ALTER PROCEDURE [dbo].[usp_InsertBranchPositions]
@division nvarchar(240),
@branch nvarchar(240),
@position nvarchar(240),
@location nvarchar(60)
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRANSACTION
DECLARE @divisionTable TABLE (InsertedDivisionId int)
DECLARE @branchTable TABLE (InsertedBranchId int)
DECLARE @positionTable TABLE (InsertedPositionId int)
DECLARE @locationTable TABLE (InsertedLocationid int)
DECLARE @divisionId int
DECLARE @branchId int
DECLARE @positionId int
DECLARE @locationId int
SELECT @divisionId = [Id] FROM [dbo].[Divisions]
WHERE DivisionName = @division
IF @divisionId IS NULL
BEGIN
INSERT INTO [dbo].[Divisions] (DivisionName, IsDivisionActive)
VALUES (@division, 1)
SELECT @divisionId = SCOPE_IDENTITY()
END
SELECT @branchId = [Id] FROM [dbo].[Branches]
WHERE BranchName = @branch
IF @branchId IS NULL
BEGIN
INSERT INTO [dbo].[Branches] (BranchName, IsBranchActive, DivisionId)
VALUES (@branch, 1, @divisionId)
SELECT @branchId = SCOPE_IDENTITY()
END
SELECT @positionId = [Id] FROM [dbo].[Positions]
WHERE PositionName = @position
IF @positionId IS NULL
BEGIN
INSERT INTO [dbo].[Positions] (PositionName, IsPositionActive, BranchId)
VALUES (@position, 1, @branchId)
SELECT @positionId = SCOPE_IDENTITY()
END
SELECT @locationId = [Id] FROM [dbo].[Locations]
WHERE LocationName = @location
IF @locationId IS NULL
BEGIN
INSERT INTO [dbo].[Locations] (LocationName, IsLocationActive)
VALUES (@location, 1)
SELECT @locationId = SCOPE_IDENTITY()
END
INSERT INTO [dbo].[PositionLocationMappings] (PositionId, LocationId)
VALUES (@positionId, @locationId)
COMMIT TRANSACTION
END
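Answer 0 (score: 2)
You can import the data with set-based operations instead of calling a procedure for each row.
For example, you can change the work done by this snippet: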
IF @divisionId IS NULL
BEGIN
INSERT INTO [dbo].[Divisions] (DivisionName, IsDivisionActive)
VALUES (@division, 1)
SELECT @divisionId = SCOPE_IDENTITY()
END
to:
insert Divisions
(DivisionName, IsDivisionActive)
select distinct Division
, 1
from staging.BranchPositions
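The original procedure only inserts a division when it is not already present. If dbo.Divisions may already hold rows when the script runs, a guarded variant along these lines keeps that behaviour. This is a sketch that assumes the staging table and column names shown in the question ([staging].BranchPositions with a Division column).

```sql
-- Sketch: insert only the divisions that are not in dbo.Divisions yet.
-- Assumes the staging table and column names from the question.
INSERT INTO dbo.Divisions (DivisionName, IsDivisionActive)
SELECT DISTINCT bp.Division, 1
FROM staging.BranchPositions AS bp
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.Divisions AS d
                  WHERE d.DivisionName = bp.Division);
```

On a fresh deployment where the target table starts empty, the NOT EXISTS check is not needed and the plain insert is enough.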
Then for Branches, you can use a join to look up the DivisionId:
insert Branches
(BranchName, IsBranchActive, DivisionId)
select distinct bp.Branch
, 1
, d.Id
from staging.BranchPositions bp
join Divisions d
on bp.Division = d.DivisionName
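And so on for the remaining tables. This should be much faster; I've used this approach to import billions of rows within a minute.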
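The remaining steps could look roughly like the following. This is a sketch, not a tested script: it takes the table and column names from the question, and it assumes that the target tables start empty and that Divisions and Branches have already been loaded. It also assumes a branch name is unique within its division and a position name is unique within its branch, so it joins through the whole hierarchy. That differs from the original procedure, which matches branches and positions by name alone.

```sql
-- Positions: find each row's branch through its division, then insert
-- one position per (position name, branch) pair.
INSERT INTO dbo.Positions (PositionName, IsPositionActive, BranchId)
SELECT DISTINCT bp.Position, 1, b.Id
FROM staging.BranchPositions AS bp
JOIN dbo.Divisions AS d ON d.DivisionName = bp.Division
JOIN dbo.Branches  AS b ON b.BranchName = bp.Branch
                       AND b.DivisionId = d.Id;

-- Locations: no parent table, so a plain distinct insert is enough.
INSERT INTO dbo.Locations (LocationName, IsLocationActive)
SELECT DISTINCT bp.Location, 1
FROM staging.BranchPositions AS bp;

-- PositionLocationMappings: resolve both ids by name and insert each
-- (position, location) pair once.
INSERT INTO dbo.PositionLocationMappings (PositionId, LocationId)
SELECT DISTINCT p.Id, l.Id
FROM staging.BranchPositions AS bp
JOIN dbo.Divisions AS d ON d.DivisionName = bp.Division
JOIN dbo.Branches  AS b ON b.BranchName = bp.Branch
                       AND b.DivisionId = d.Id
JOIN dbo.Positions AS p ON p.PositionName = bp.Position
                       AND p.BranchId = b.Id
JOIN dbo.Locations AS l ON l.LocationName = bp.Location;
```

Each statement touches the staging table once, so the whole load is five set-based inserts instead of roughly 16,000 procedure calls with their own transactions.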