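I have two landfill disposal tables (DW and VT) in two separate SQL Server databases that contain nearly identical information. I am trying to merge them into the VT table. The table in DW contains the most up-to-date information and looks like this: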
TicketNumber   ContainerID
12345          12
12345          17
12345          22
23456          12
23456          14
23456          32
23456          12
The table in VT looks like this:
TicketNumber   ContainerID   Pickups
12345          12            1
12345          17            1
23456          12            1
I would like the combined VT table to look like this:
TicketNumber   ContainerID   Pickups
12345          12            1
12345          17            1
12345          22            1
23456          12            2
23456          14            1
23456          32            1
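The problem is that there is some overlap between the two. I only want to insert the rows from DW that are missing in VT, but I can't get a WHERE clause to compare the two values together; it compares each value separately.
Also, I want to count the number of identical records in DW and put that count in the [Pickups] column (for example, ticket 23456, container 12). I need to make sure that any DW rows that don't exist in VT are inserted with the correct count, and if they already exist in VT, I want to update [Pickups] to the correct count.
Here is my code so far: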
INSERT INTO [VT]
(TicketNumber, ContainerID, Pickups)
SELECT DISTINCT
DW.[TICKET NUMBER],
DW.[CONTAINER ID],
'??????', --Not sure how to code this to count rows.
FROM [DW]
WHERE ?????
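These tables are huge (think over a million rows in DW), so please keep that in mind. I tried a NOT EXISTS approach, and it ran for 20 minutes before I stopped it. Thanks for your help.
Answer 0 (score: 0)
You should try a MERGE/UPDATE statement. First, I would alter table VT to add a "Pickup" column (or otherwise create a table VT_final where you can put the result). Assuming you add a "Pickup" column to your existing VT table, I would do something like this: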
;WITH DW_New AS
(
    SELECT
        TicketNumber
        ,ContainerID
        ,Pickup = COUNT(*) --Let's count the "duplicate" lines from DW; GROUP BY gives MERGE one source row per ticket/container
    FROM DW WITH (NOLOCK)
    GROUP BY TicketNumber, ContainerID
)
MERGE VT AS tgt
USING DW_New AS src
    ON src.TicketNumber = tgt.TicketNumber
    AND src.ContainerID = tgt.ContainerID
WHEN MATCHED AND --This is when you have already a line in VT containing the same ticket and container
(
    src.Pickup != tgt.Pickup
)
THEN UPDATE SET
    tgt.Pickup = tgt.Pickup + src.Pickup --Here I assume that you want to add the "Pickup" value from DW to the existing VT "Pickup" value. Otherwise you can rework this.
WHEN NOT MATCHED --This is your insert : taking DW data into VT when VT doesn't contain an existing line for the container and ticket
THEN
INSERT
(
    TicketNumber
    ,ContainerID
    ,Pickup
)
VALUES
(
    src.TicketNumber
    ,src.ContainerID
    ,src.Pickup
); --MERGE must be terminated with a semicolon
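Note that I assumed you might run this several times and would then want to update the "Pickup" value in VT, and I assumed you would simply add the DW "Pickup" value to it. Of course, I don't know whether you "reset" (i.e. truncate) the DW table on each run, so you may want to rework this to fit your needs.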
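If DW is not reset between runs, adding the counts on every run would inflate the values. Here is a minimal sketch of a rerun-safe variant that overwrites the stored count with the DW count instead of adding to it. It assumes the same "Pickup" column on VT as above, and that DW holds the complete history for each ticket/container pair. The CTE name DW_Counts is just illustrative.

```sql
-- Sketch only: overwrite Pickup with the DW count, so running it twice gives the same result.
-- Assumes VT has a Pickup column and DW contains every pickup for each ticket/container pair.
;WITH DW_Counts AS
(
    SELECT
        TicketNumber
        ,ContainerID
        ,Pickup = COUNT(*)          -- one source row per ticket/container pair
    FROM DW
    GROUP BY TicketNumber, ContainerID
)
MERGE VT AS tgt
USING DW_Counts AS src
    ON  src.TicketNumber = tgt.TicketNumber
    AND src.ContainerID  = tgt.ContainerID
WHEN MATCHED AND (tgt.Pickup IS NULL OR tgt.Pickup <> src.Pickup)
    THEN UPDATE SET tgt.Pickup = src.Pickup   -- replace, do not add
WHEN NOT MATCHED BY TARGET
    THEN INSERT (TicketNumber, ContainerID, Pickup)
         VALUES (src.TicketNumber, src.ContainerID, src.Pickup);
```

With the sample data from the question, this should leave ticket 23456 / container 12 at 2 and insert the three missing pairs with a count of 1. The IS NULL check is there because a freshly added column would be NULL, and a plain <> comparison never matches NULL.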
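Edit: I forgot two things in my answer:
I would try this index on your target table (assuming you already have a primary key, which means you cannot create another clustered index):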
CREATE NONCLUSTERED INDEX [IX_VT_Index] ON VT
(
TicketNumber ASC,
ContainerID ASC
)
INCLUDE (Pickup) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
Answer 1 (score: 0)
Why not just re-summarize the DW table?
truncate table vt;
insert into vt(TicketNumber, ContainerID, Pickups)
select TicketNumber, ContainerID, count(*) as Pickups
from Dw
group by TicketNumber, ContainerID;
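One caveat with the reload above: if the insert fails partway, VT is left empty. A possible sketch that wraps both statements in a transaction is shown below. It assumes VT is not referenced by any foreign key (TRUNCATE TABLE would be rejected otherwise) and that VT holds no rows you still need that are absent from DW.

```sql
-- Sketch only: make the truncate-and-reload all-or-nothing.
-- Assumes no foreign keys reference vt and that Dw is the complete source of truth.
SET XACT_ABORT ON;   -- roll the whole transaction back on a run-time error

BEGIN TRANSACTION;

    TRUNCATE TABLE vt;   -- in SQL Server this can be rolled back inside a transaction

    INSERT INTO vt (TicketNumber, ContainerID, Pickups)
    SELECT TicketNumber, ContainerID, COUNT(*) AS Pickups
    FROM Dw
    GROUP BY TicketNumber, ContainerID;

COMMIT TRANSACTION;
```

Readers of vt will be blocked while the reload runs, so this is probably best scheduled for a quiet period.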