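I need to build a Type 2 dimension table to store price changes for various products. In the source data there are two tables I can draw from: one holds the current price of each product, and the other holds each product's price change history. Some products change price more often than others, and a product whose price has never changed has no record in the price change table at all.
Given the following current price table: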
PRODUCT_ID CURRENT_PRICE
---------- -------------
ABC123 250
DEF456 200
GHI789 325
and this product price history table:
PRODUCT_ID OLD_PRICE NEW_PRICE CHANGE_DATE
---------- --------- --------- -----------
ABC123 275 250 1/1/2016
DEF456 250 225 6/1/2015
DEF456 225 200 1/1/2016
what SQL can I run to populate a Type 2 dimension that looks like this:
PRODUCT_ID PRODUCT_PRICE VALID_FROM VALID_TO CURRENT_PRICE_INDICATOR
---------- ------------- ---------- -------- ----------------------
ABC123 275 1/1/1900 12/31/2015 N
ABC123 250 1/1/2016 12/31/9999 Y
DEF456 250 1/1/1900 5/31/2015 N
DEF456 225 6/1/2015 12/31/2015 N
DEF456 200 1/1/2016 12/31/9999 Y
GHI789 325 1/1/1900 12/31/9999 Y
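Answer 0 (score: 1)
Your end state is a typical Type 2 slowly changing dimension. The price history table is a bit of a red herring here, because NEW_PRICE should be ignored: in the sample data, each NEW_PRICE reappears either as the OLD_PRICE of the next change or as the current price. Simply write the data from the initial load into the dimension table, for example: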
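CREATE TABLE Dim_Price (
    Price_Key          INT IDENTITY,           -- surrogate key, generated automatically
    Product_ID         NVARCHAR(10) NOT NULL,  -- business key from the source system
    Price              INT          NOT NULL,
    Row_Effective_Date DATETIME     NOT NULL,
    Row_Expiry_Date    DATETIME     NOT NULL,
    Row_Current_Flag   INT          NOT NULL   -- 1 = current row, 0 = expired row
);
-- Initial load: one open-ended row per product, holding its earliest known price.
INSERT INTO Dim_Price (Product_ID, Price, Row_Effective_Date, Row_Expiry_Date, Row_Current_Flag) VALUES
    ('ABC123',275,'1 Jan 1900','31 Dec 9999',1),
    ('DEF456',250,'1 Jan 1900','31 Dec 9999',1);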
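The two sample rows above match the earliest OLD_PRICE of each product in the history table. As a rough sketch of how such an initial load could be derived rather than typed by hand, the query below takes the earliest OLD_PRICE where history exists and falls back to the current price otherwise. The table variables `@price` and `@priceHist` are stand-ins for the question's two source tables (their real names are not given), and the INT price columns are an assumption chosen to match Dim_Price.

```sql
-- Sketch only: @price and @priceHist are hypothetical stand-ins for the source tables.
DECLARE @price TABLE (PRODUCT_ID VARCHAR(100), CURRENT_PRICE INT);
INSERT INTO @price VALUES ('ABC123',250), ('DEF456',200), ('GHI789',325);

DECLARE @priceHist TABLE (PRODUCT_ID VARCHAR(100), OLD_PRICE INT, NEW_PRICE INT, CHANGE_DATE DATE);
INSERT INTO @priceHist VALUES
    ('ABC123',275,250,'20160101'),
    ('DEF456',250,225,'20150601'),
    ('DEF456',225,200,'20160101');

SELECT p.PRODUCT_ID
      -- No history row means the current price is also the opening price.
      ,COALESCE(firstChange.OLD_PRICE, p.CURRENT_PRICE) AS Price
      ,CAST('19000101' AS DATETIME) AS Row_Effective_Date
      ,CAST('99991231' AS DATETIME) AS Row_Expiry_Date
      ,1 AS Row_Current_Flag
FROM @price AS p
OUTER APPLY
(
    -- Earliest recorded change for this product, if there is one.
    SELECT TOP (1) ph.OLD_PRICE
    FROM @priceHist AS ph
    WHERE ph.PRODUCT_ID = p.PRODUCT_ID
    ORDER BY ph.CHANGE_DATE ASC
) AS firstChange
ORDER BY p.PRODUCT_ID;
```

Unlike the hand-written load, this also yields a row for GHI789, which has no history and so starts at its current price.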
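From then on, you can merge from the source table into the target table with something like the MERGE statement below (I can't verify the syntax at the moment).
For more on slowly changing dimensions and how to handle them, see the Kimball Group website. After all, Ralph Kimball did invent them :)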
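-- The outer INSERT adds a new current row for every product whose price changed.
INSERT INTO Dim_Price (Product_ID, Price, Row_Effective_Date, Row_Expiry_Date, Row_Current_Flag)
SELECT
    Product_ID
    ,Price
    ,Row_Effective_Date
    ,Row_Expiry_Date
    ,Row_Current_Flag
FROM (
    MERGE Dim_Price TGT
    USING STG_Price SRC ON SRC.Product_ID = TGT.Product_ID
    -- New product: insert its first row, open-ended in both directions.
    WHEN NOT MATCHED THEN
        INSERT (Product_ID, Price, Row_Effective_Date, Row_Expiry_Date, Row_Current_Flag)
        VALUES (
            SRC.Product_ID
            ,SRC.Price
            ,'1 Jan 1900'
            ,'31 Dec 9999'
            ,1)
    -- Price changed: expire the current row as of 23:59:59 yesterday.
    WHEN MATCHED AND TGT.Row_Current_Flag = 1 AND EXISTS (
            SELECT SRC.Price
            EXCEPT
            SELECT TGT.Price)
        THEN UPDATE SET TGT.Row_Current_Flag = 0
            ,TGT.Row_Expiry_Date = DATEADD(SECOND, 86399, DATEADD(DAY, -1, CAST(CAST(GETDATE() AS DATE) AS DATETIME)))
    -- Name every output column so the outer SELECT and WHERE can reference it.
    OUTPUT $action AS Action_Out
        ,SRC.Product_ID
        ,SRC.Price
        ,GETDATE() AS Row_Effective_Date
        ,CAST('31 Dec 9999' AS DATETIME) AS Row_Expiry_Date
        ,1 AS Row_Current_Flag
) AS MERGE_OUT
WHERE MERGE_OUT.Action_Out = 'UPDATE';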
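The MERGE reads from STG_Price, which the answer never defines. A minimal sketch of what that staging table might look like follows; the column names and types are inferred from how the MERGE uses them, and the primary key is an added assumption that guarantees one source row per product (MERGE raises an error if several source rows match the same target row).

```sql
-- Hypothetical staging table; name and columns are inferred from the MERGE statement.
CREATE TABLE STG_Price (
    Product_ID NVARCHAR(10) NOT NULL PRIMARY KEY,  -- one row per product
    Price      INT          NOT NULL
);

-- Snapshot of today's prices, using the sample values from the question's current price table.
INSERT INTO STG_Price (Product_ID, Price) VALUES
    ('ABC123', 250),
    ('DEF456', 200),
    ('GHI789', 325);
```

With Dim_Price holding the two initial rows, a merge run against this snapshot is intended to expire the ABC123 and DEF456 rows, add new current rows at 250 and 200, and insert GHI789 as a new product. Because the staging table only carries the latest price, an intermediate price such as DEF456 at 225 would reach the dimension only if the merge had been run while that price was current.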
Answer 1 (score: 1)
I think it would be something like this:
-- Sample data from the question.
DECLARE @price TABLE(PRODUCT_ID VARCHAR(100),CURRENT_PRICE DECIMAL(8,4));
INSERT INTO @price VALUES
 ('ABC123',250)
,('DEF456',200)
,('GHI789',325);
DECLARE @priceHist TABLE(PRODUCT_ID VARCHAR(100),OLD_PRICE DECIMAL(8,4),NEW_PRICE DECIMAL(8,4),CHANGE_DATE DATE);
INSERT INTO @priceHist VALUES
 ('ABC123',275,250,{d'2016-01-01'})
,('DEF456',250,225,{d'2015-06-01'})
,('DEF456',225,200,{d'2016-01-01'});
WITH AllData AS
(
    -- Number each product's prices in date order; the current price (NULL date) sorts last.
    SELECT ROW_NUMBER() OVER(PARTITION BY Combined.PRODUCT_ID ORDER BY ISNULL(Combined.CHANGE_DATE,{d'9999-12-31'}) ASC) AS Inx
          ,*
          ,CASE WHEN CHANGE_DATE IS NULL THEN 'Y' ELSE 'N' END AS CURRENT_PRICE_INDICATOR
    FROM
    (
        -- The current price has no closing change date.
        SELECT p.PRODUCT_ID AS PRODUCT_ID
              ,p.CURRENT_PRICE AS PRODUCT_PRICE
              ,NULL AS CHANGE_DATE
        FROM @price AS p
        UNION ALL
        -- Each old price was valid until the day before its change date.
        SELECT ph.PRODUCT_ID
              ,ph.OLD_PRICE
              ,ph.CHANGE_DATE
        FROM @priceHist AS ph
    ) AS Combined
)
SELECT ad.PRODUCT_ID
      ,ad.PRODUCT_PRICE
      --Version with LAG (SQL Server 2012 and higher)
      --,CASE WHEN ad.Inx=1 THEN {d'1900-01-01'} ELSE LAG(ad.CHANGE_DATE,1) OVER(PARTITION BY ad.PRODUCT_ID ORDER BY ISNULL(ad.CHANGE_DATE,{d'9999-12-31'}) ASC) END AS VALID_FROM
      ,CASE WHEN ad.Inx=1 THEN {d'1900-01-01'} ELSE LAG_Replace_For_SQLServer2008.CHANGE_DATE END AS VALID_FROM
      ,CASE WHEN ad.CURRENT_PRICE_INDICATOR='Y' THEN {d'9999-12-31'} ELSE DATEADD(DAY,-1,ad.CHANGE_DATE) END AS VALID_TO
      ,ad.CURRENT_PRICE_INDICATOR
FROM AllData AS ad
OUTER APPLY
(
    -- Previous row of the same product: its change date is where this row's validity starts.
    SELECT x.CHANGE_DATE
    FROM AllData AS x
    WHERE x.PRODUCT_ID=ad.PRODUCT_ID
      AND x.Inx=ad.Inx-1
) LAG_Replace_For_SQLServer2008
ORDER BY ad.PRODUCT_ID, ad.Inx;
Result:
ABC123 275.0000 1900-01-01 2015-12-31 N
ABC123 250.0000 2016-01-01 9999-12-31 Y
DEF456 250.0000 1900-01-01 2015-05-31 N
DEF456 225.0000 2015-06-01 2015-12-31 N
DEF456 200.0000 2016-01-01 9999-12-31 Y
GHI789 325.0000 1900-01-01 9999-12-31 Y
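The commented-out line in the query hints at a LAG variant for SQL Server 2012 and later. As a sketch of how that variant might look in full, the version below drops the ROW_NUMBER column and the OUTER APPLY and lets LAG supply the previous change date directly. It reuses the same sample data; the CTE name `Combined` and the use of COALESCE with plain date strings are choices made for this illustration, not part of the original answer.

```sql
-- Sketch only: requires SQL Server 2012 or later (LAG).
DECLARE @price TABLE (PRODUCT_ID VARCHAR(100), CURRENT_PRICE DECIMAL(8,4));
INSERT INTO @price VALUES ('ABC123',250), ('DEF456',200), ('GHI789',325);

DECLARE @priceHist TABLE (PRODUCT_ID VARCHAR(100), OLD_PRICE DECIMAL(8,4),
                          NEW_PRICE DECIMAL(8,4), CHANGE_DATE DATE);
INSERT INTO @priceHist VALUES
    ('ABC123',275,250,'20160101'),
    ('DEF456',250,225,'20150601'),
    ('DEF456',225,200,'20160101');

WITH Combined AS
(
    -- Current price: open-ended row with no closing change date.
    SELECT p.PRODUCT_ID, p.CURRENT_PRICE AS PRODUCT_PRICE, CAST(NULL AS DATE) AS CHANGE_DATE
    FROM @price AS p
    UNION ALL
    -- Each old price was valid until the day before its change date.
    SELECT ph.PRODUCT_ID, ph.OLD_PRICE, ph.CHANGE_DATE
    FROM @priceHist AS ph
)
SELECT c.PRODUCT_ID
      ,c.PRODUCT_PRICE
      -- The previous row's change date starts this row; the first row has none, so use 1900-01-01.
      ,COALESCE(
           LAG(c.CHANGE_DATE) OVER (PARTITION BY c.PRODUCT_ID
                                    ORDER BY COALESCE(c.CHANGE_DATE, CAST('99991231' AS DATE))),
           CAST('19000101' AS DATE)) AS VALID_FROM
      ,CASE WHEN c.CHANGE_DATE IS NULL THEN CAST('99991231' AS DATE)
            ELSE DATEADD(DAY, -1, c.CHANGE_DATE) END AS VALID_TO
      ,CASE WHEN c.CHANGE_DATE IS NULL THEN 'Y' ELSE 'N' END AS CURRENT_PRICE_INDICATOR
FROM Combined AS c
ORDER BY c.PRODUCT_ID, VALID_FROM;
```

On the sample data this should produce the same six rows as the result listed above. The design relies on the current-price row always sorting last within its product, so LAG never picks up a NULL date from a preceding row.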