SQL Server: converting a Natural Key to a Surrogate Key

Asked: 2013-02-08 18:59:22

Tags: sql sql-server sql-server-2008-r2 data-warehouse business-intelligence

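I'm trying to find a way to convert the natural key on some data I import into some kind of surrogate key.

Example:

Say I have a table named OrderFact that holds all the information about my orders: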

CREATE TABLE OrderFact
(
     Id INT IDENTITY PRIMARY KEY
    ,OrderId INT NOT NULL
    ,Amount INT NOT NULL
    ,Cost MONEY NOT NULL
    ,SaleDate DATE NOT NULL
);
GO    

INSERT INTO OrderFact (OrderId, Amount, Cost, SaleDate)
VALUES (1, 2, 12.00, '2012-01-01'), (3, 1, 6.00, '2011-12-29'), (4, 5, 1.00, '2012-01-01');

I receive all of my order data from several vendors' POS systems, so my staging table looks like this:

CREATE TABLE OrderStaging
(
     OrderId INT
    ,Vendor INT
    ,Amount INT
    ,Cost MONEY
    ,SaleDate DATE
);
GO

INSERT INTO OrderStaging (OrderId, Vendor, Amount, Cost, SaleDate)
VALUES (1, 1, 2, 12.00, '2012-01-01'), (3, 2, 1, 6.00, '2011-12-29'), (4, 1, 5, 1.00, '2012-01-01');

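I don't care who placed an order, but I do want to keep track of orders placed by the same vendor on the same day, because they count as a Bulk Order to which I apply a special order discount.

Is there a way to structure my database so that I can keep track of which orders are bulk orders, effectively converting the natural key of (Vendor, SaleDate) into some kind of surrogate key that I can look up OrderFact.Id against?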

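1 answer:

Answer 0 (score: 0)

I ended up creating a RelatedOrder table with a surrogate key as its primary key and the natural key as its columns. I then created a RelatedOrderMapping table that maps a RelatedOrder and its OrderId to the fact table. Example (T-SQL for SQL Server 2008 R2):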

--Create our RelatedOrder table which contains the Natural Key as its columns
CREATE TABLE RelatedOrder
(
     Id INT IDENTITY PRIMARY KEY
    ,VendorId INT NOT NULL
);
GO

--Create our mapping table for linking OrderFact to RelatedOrder
CREATE TABLE RelatedOrderMapping
(
     Id INT IDENTITY PRIMARY KEY
    ,RelatedOrderId INT NOT NULL REFERENCES RelatedOrder (Id)
    ,OrderFactId INT NOT NULL REFERENCES OrderFact (Id)
    ,OrderId INT NOT NULL
);
GO

--Insert one instance of each vendor (the staging column is named Vendor)
INSERT INTO RelatedOrder (VendorId)
SELECT DISTINCT Vendor FROM OrderStaging;
GO

--Create a temp table to hold the rows captured by the OUTPUT clause
--(no FOREIGN KEY constraints: SQL Server does not enforce them on temporary tables)
CREATE TABLE #RelatedOrder
(
     Id INT IDENTITY PRIMARY KEY
    ,RelatedOrderId INT NOT NULL
    ,OrderFactId INT NOT NULL
    ,OrderId INT NOT NULL
);
GO

--Create our precomputed table to speed up the MERGE
CREATE TABLE #OrderStaging
(
     OrderId INT NOT NULL
    ,Amount INT NOT NULL
    ,Cost MONEY NOT NULL
    ,SaleDate DATE NOT NULL
    ,RelatedOrderId INT NOT NULL
);
GO

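--Precompute the MERGE source: resolve each staging row to its RelatedOrder surrogate key
INSERT INTO #OrderStaging (OrderId, Amount, Cost, SaleDate, RelatedOrderId)
SELECT os.OrderId, os.Amount, os.Cost, os.SaleDate, ro.Id
FROM OrderStaging AS os
JOIN RelatedOrder AS ro
    ON ro.VendorId = os.Vendor; --the staging column is named Vendor, not VendorId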

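--Insert our data into the fact table and output the result into #RelatedOrder.
--ON 1 = 0 never matches, so every source row is inserted; MERGE is used instead of
--INSERT ... SELECT because its OUTPUT clause can also reference [Source] columns.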
MERGE OrderFact AS [Target]
USING #OrderStaging AS [Source]
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
    INSERT(OrderId, Amount, Cost, SaleDate)
    VALUES([Source].OrderId, [Source].Amount, [Source].Cost, [Source].SaleDate)
    OUTPUT Inserted.Id, [Source].OrderId, [Source].RelatedOrderId
    INTO #RelatedOrder(OrderFactId, OrderId, RelatedOrderId)
;

--Insert our mappings
INSERT INTO RelatedOrderMapping(RelatedOrderId, OrderFactId, OrderId)
SELECT RelatedOrderId, OrderFactId, OrderId FROM #RelatedOrder;
GO

--Done
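
To close the loop on the original question, here is a minimal sketch of how the mapping could be queried to find bulk orders. It assumes the script above has already been run against the sample data. RelatedOrder only stores VendorId in this design, so the sale date is read from OrderFact. The CTE name Grouped and the column alias OrdersInGroup are made up for illustration.

```sql
--List every fact row that shares its vendor and sale date with at least one other order
WITH Grouped AS
(
    SELECT   m.RelatedOrderId
            ,ro.VendorId
            ,f.SaleDate
            ,f.Id AS OrderFactId
            ,f.OrderId
             --Number of orders this vendor placed on this date
            ,COUNT(*) OVER (PARTITION BY m.RelatedOrderId, f.SaleDate) AS OrdersInGroup
    FROM RelatedOrderMapping AS m
    JOIN RelatedOrder AS ro
        ON ro.Id = m.RelatedOrderId
    JOIN OrderFact AS f
        ON f.Id = m.OrderFactId
)
SELECT RelatedOrderId, VendorId, SaleDate, OrderFactId, OrderId, OrdersInGroup
FROM Grouped
WHERE OrdersInGroup > 1 --keep only groups that qualify as a bulk order
ORDER BY VendorId, SaleDate, OrderFactId;
```

With the sample staging rows, the two orders from vendor 1 dated 2012-01-01 (OrderId 1 and 4) form the only such group. If the surrogate key should stand for the full (Vendor, SaleDate) pair, one option would be to add a SaleDate column to RelatedOrder and join on both columns when precomputing #OrderStaging; that is a variation on the approach above, not something the script already does.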