Avoiding duplicates when migrating from one table to another

Date: 2018-03-01 13:28:09

Tags: tsql

I need to migrate data from the old Books table:

create table dbo.Books_OLD ( 
  Id int identity not null constraint PK_Books_OLD_Id primary key (Id),
  Title nvarchar (200) not null,
  Image varbinary (max) null, 
  Preview varbinary (max) null
) 

to the new table structure:

create table dbo.Books ( 
  Id int identity not null constraint PK_Books_Id primary key (Id),
  Title nvarchar (200) not null 
)    

create table dbo.Files (
  Id int identity not null constraint PK_Files_Id primary key (Id),
  Content varbinary (max) null,
  Name nvarchar (280) null
)

create table dbo.BookFiles (
  BookId int not null, 
  FileId int not null, 
    constraint PK_BookFiles_Id primary key (BookId, FileId)
)

alter table dbo.BookFiles
add constraint FK_BookFiles_BookId foreign key (BookId) references Books(Id) on delete cascade on update cascade,
    constraint FK_BookFiles_FileId foreign key (FileId) references Files(Id) on delete cascade on update cascade;

The migration should run as follows:

Books_OLD.Title => Create new Book with given Title value
Books_OLD.Image => Create new File with Image content.
                   Create new BookFile to associate File to Book.
Books_OLD.Preview => Create new File with Preview content.
                     Create new BookFile to associate File to Book.

I was able to migrate the data, but somehow, when I run this:

select FileId
from BookFiles
group by FileId
having count(*) > 1;

I get duplicates. I should not have duplicate FileIds. What am I missing?

My migration code is:

DECLARE @BOOKS table (
  BookId int,
  Image varbinary(max),
  Preview varbinary(max)
)

MERGE Books AS d
USING Books_OLD AS s
ON 0 = 1
WHEN NOT MATCHED
THEN INSERT (Title)
VALUES (s.Title)
OUTPUT INSERTED.Id, s.Image, s.Preview
INTO @BOOKS;

INSERT Files (Content, Created)
SELECT t.Content, GETUTCDATE()
FROM @BOOKS i
CROSS APPLY (VALUES (Preview, 'Preview'), (Image, 'Image')) t(Content, ContentType)
WHERE Content IS NOT NULL

INSERT BookFiles (BookId, FileId)
SELECT i.BookId, f.Id
FROM @BOOKS i
JOIN Files f
ON f.Content = i.Image

UNION ALL

SELECT i.BookId, f.Id
FROM @BOOKS i
JOIN Files f
ON f.Content = i.Preview

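Some books can have two files (image and preview), so a BookId can appear more than once in BookFiles.

But each file (image or preview) in the Books_OLD table should be associated with only one book, so it is strange that I have duplicate FileIds in BookFiles.

What am I missing?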

1 answer:

Answer 0: (score: 1)

If different books in Books_OLD have the same Image or Preview, then this part of your original code:

INSERT BookFiles (BookId, FileId)
SELECT i.BookId, f.Id
FROM @BOOKS i
JOIN Files f
ON f.Content = i.Image

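returns more rows than you expect when it performs the INNER JOIN, because identical images or previews that belong to different books are all joined to each other. A duplicated FileId is actually a wrong record: its BookId does not correspond to that particular image or preview, even though the content is the same.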
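
To see the effect in isolation, here is a minimal sketch with made-up data (the two books and the 0x0102 value are hypothetical, not taken from your tables). Two books share the same image bytes, and joining on content alone pairs each book with both files:

```sql
-- Hypothetical data: two books that happen to share the same image bytes.
DECLARE @BOOKS table (BookId int, Image varbinary(max));
INSERT @BOOKS (BookId, Image) VALUES (1, 0x0102), (2, 0x0102);

-- One file row per book image, the way the migration creates them.
DECLARE @Files table (Id int identity(1,1) primary key, Content varbinary(max));
INSERT @Files (Content) SELECT Image FROM @BOOKS;

-- Joining on content alone matches every book with every identical file:
-- 2 books x 2 identical files = 4 rows, so each file Id shows up twice.
SELECT i.BookId, f.Id AS FileId
FROM @BOOKS i
JOIN @Files f
ON f.Content = i.Image;
```

Only two of those four rows are correct pairs; the other two attach a file to the wrong book.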

What you can do is use another table variable named @Files, with a structure similar to the Files table plus one additional column, BookId, and then:

INSERT BookFiles (BookId, FileId)
SELECT i.BookId, f.Id
FROM @BOOKS i
JOIN @Files f
ON f.Content = i.Image
AND f.BookId = i.BookId  --added joining condition
--assume code before has inserted bookId into `@Files`

The rows in @Files also have to be inserted into Files, selecting all the required columns.

Update: see the complete code below:

DECLARE @BOOKS table (
  BookId int,
  Image varbinary(max),
  Preview varbinary(max)
)
--Added @Files variable
DECLARE @Files table
( 
BookId int,
Content varbinary (max) null,
Created datetime null, --filled with GETUTCDATE() below
Id int identity(1,1) not null primary key
)  

MERGE Books AS d
USING Books_OLD AS s
ON 0 = 1
WHEN NOT MATCHED
THEN INSERT (Title)
VALUES (s.Title)
OUTPUT INSERTED.Id, s.Image, s.Preview
INTO @BOOKS;

INSERT @Files (BookId,Content, Created) --
SELECT i.BookId,t.Content, GETUTCDATE()
FROM @BOOKS i
CROSS APPLY (VALUES (Preview, 'Preview'), (Image, 'Image')) t(Content, ContentType)
WHERE Content IS NOT NULL

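--Insert the files first, so the foreign key on BookFiles.FileId can be satisfied.
--IDENTITY_INSERT keeps Files.Id equal to @Files.Id; this assumes Files has no rows with these ids yet.
SET IDENTITY_INSERT Files ON

INSERT INTO Files (Id, Content, Created)
SELECT Id, Content, Created
FROM @Files

SET IDENTITY_INSERT Files OFF

INSERT BookFiles (BookId, FileId)
SELECT i.BookId, f.Id
FROM @BOOKS i
JOIN @Files f
ON f.Content = i.[Image]
AND f.BookId = i.BookId  --added joining condition

UNION ALL

SELECT i.BookId, f.Id
FROM @BOOKS i
JOIN @Files f
ON f.Content = i.Preview
AND f.BookId = i.BookId  --added joining condition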
PS: I am not sure whether there is a typo in dbo.Files: your table definition has Name, but the insert uses Created.