Building a Type 2 dimension with SQL

Time: 2016-02-02 21:50:30

Tags: sql sql-server sql-server-2008 type-2-dimension

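I need to build a Type 2 dimension table to store price changes for various products. In the source data there are two tables I can pull from: one holds the current price of each product, and the other holds the price change history of each product. Some products change price more often than others, and if a product's price has never changed, it has no record in the price change table at all.

Given the following current price table: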

PRODUCT_ID  CURRENT_PRICE
----------  -------------
  ABC123        250
  DEF456        200
  GHI789        325

And the product price history table:

PRODUCT_ID  OLD_PRICE   NEW_PRICE   CHANGE_DATE
----------  ---------   ---------   -----------
  ABC123        275       250        1/1/2016
  DEF456        250       225        6/1/2015
  DEF456        225       200        1/1/2016

What SQL can I run to populate a Type 2 dimension that looks like this:

PRODUCT_ID  PRODUCT_PRICE   VALID_FROM  VALID_TO    CURRENT_PRICE_INDICATOR
----------  -------------   ----------  --------    ----------------------
  ABC123        275          1/1/1900   12/31/2015      N       
  ABC123        250          1/1/2016   12/31/9999      Y
  DEF456        250          1/1/1900   5/31/2015       N
  DEF456        225          6/1/2015   12/31/2015      N
  DEF456        200          1/1/2016   12/31/9999      Y
  GHI789        325          1/1/1900   12/31/9999      Y

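2 Answers:

Answer 0 (score: 1)

Your end state is a typical Type 2 slowly changing dimension. The price history table is a bit of a red herring here, because NEW_PRICE should be ignored. Simply write the data from the initial load into the dimension table, for example: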

CREATE TABLE Dim_Price (
  Price_Key INT IDENTITY,
  Product_ID NVARCHAR(10) NOT NULL,
  Price INT NOT NULL,
  Row_Effective_Date DATETIME NOT NULL,
  Row_Expiry_Date DATETIME NOT NULL,
  Row_Current_Flag INT NOT NULL)

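 -- Explicit column list; the IDENTITY column Price_Key is filled in automatically
 INSERT INTO Dim_Price (Product_ID, Price, Row_Effective_Date, Row_Expiry_Date, Row_Current_Flag) VALUES
 ('ABC123',275,'1 Jan 1900','31 Dec 9999',1),
 ('DEF456',250,'1 Jan 1900','31 Dec 9999',1);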
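
If you would rather not type the initial rows by hand, they could be derived from the source tables instead of the hard-coded INSERT above. This is only a sketch: the table names Current_Price and Price_History are assumptions (the question does not name its source tables), and it takes each product's earliest OLD_PRICE as the starting price, falling back to the current price for products with no history rows, such as GHI789.

```sql
-- Assumed source tables (names are hypothetical):
--   Current_Price(PRODUCT_ID, CURRENT_PRICE)
--   Price_History(PRODUCT_ID, OLD_PRICE, NEW_PRICE, CHANGE_DATE)
INSERT INTO Dim_Price (Product_ID, Price, Row_Effective_Date, Row_Expiry_Date, Row_Current_Flag)
SELECT cp.PRODUCT_ID
      ,COALESCE(first_change.OLD_PRICE, cp.CURRENT_PRICE)  -- starting price
      ,'1 Jan 1900'
      ,'31 Dec 9999'
      ,1
FROM Current_Price AS cp
OUTER APPLY
(
    -- Earliest recorded change for this product, if any
    SELECT TOP (1) ph.OLD_PRICE
    FROM Price_History AS ph
    WHERE ph.PRODUCT_ID = cp.PRODUCT_ID
    ORDER BY ph.CHANGE_DATE ASC
) AS first_change;
```

Run this in place of the hand-written INSERT, not in addition to it, otherwise ABC123 and DEF456 would be loaded twice.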

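From then on, you can merge from the source table into the target table with something like the MERGE statement below (I can't verify the syntax right now).

For more information on slowly changing dimensions and how to handle them, see the Kimball Group website. After all, Ralph Kimball did invent them :)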

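-- The outer INSERT adds the new current row for every product the MERGE expired
INSERT INTO Dim_Price (Product_ID, Price, Row_Effective_Date, Row_Expiry_Date, Row_Current_Flag)
SELECT
  Product_ID
 ,Price
 ,Row_Effective_Date
 ,Row_Expiry_Date
 ,Row_Current_Flag
FROM (
MERGE Dim_Price TGT
USING STG_Price SRC ON SRC.Product_ID = TGT.Product_ID
-- New product: insert it as the open-ended current row
WHEN NOT MATCHED THEN 
INSERT (Product_ID, Price, Row_Effective_Date, Row_Expiry_Date, Row_Current_Flag)
VALUES(
  SRC.Product_ID
 ,SRC.Price
 ,'1 Jan 1900'
 ,'31 Dec 9999'
 ,1)
-- Price changed: expire the current row at 23:59:59 yesterday
 WHEN MATCHED AND TGT.Row_Current_Flag = 1 AND EXISTS(
   SELECT SRC.Price
   EXCEPT
   SELECT TGT.Price)
 THEN UPDATE SET TGT.Row_Current_Flag = 0
                ,TGT.Row_Expiry_Date = DATEADD(SECOND,86399,DATEADD(DAY,-1,CAST(CAST(GETDATE() AS DATE) AS DATETIME)))
 -- Every output column needs an alias so the outer SELECT can reference it
 OUTPUT $action AS Action_Out
        ,SRC.Product_ID
        ,SRC.Price
        ,CAST(CAST(GETDATE() AS DATE) AS DATETIME) AS Row_Effective_Date  -- midnight today, right after the expiry above
        ,'31 Dec 9999' AS Row_Expiry_Date
        ,1 AS Row_Current_Flag
) AS MERGE_OUT
WHERE MERGE_OUT.Action_Out = 'UPDATE';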
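
After a merge like this has run, a quick check along the following lines might help confirm that every product still has exactly one current row. It is only a sketch against the Dim_Price table defined earlier, and it should return no rows when the dimension is consistent.

```sql
-- Lists any product that has zero or more than one current row
SELECT Product_ID
      ,SUM(CASE WHEN Row_Current_Flag = 1 THEN 1 ELSE 0 END) AS Current_Rows
FROM Dim_Price
GROUP BY Product_ID
HAVING SUM(CASE WHEN Row_Current_Flag = 1 THEN 1 ELSE 0 END) <> 1;
```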

Answer 1 (score: 1)

I think something like this would do it:

DECLARE @price TABLE(PRODUCT_ID VARCHAR(100),CURRENT_PRICE DECIMAL(8,4));
INSERT INTO @price VALUES
 ('ABC123',250)
,('DEF456',200)
,('GHI789',325);

DECLARE @priceHist TABLE(PRODUCT_ID VARCHAR(100),OLD_PRICE DECIMAL(8,4),NEW_PRICE DECIMAL(8,4),CHANGE_DATE DATE);
INSERT INTO @priceHist VALUES
 ('ABC123',275,250,{d'2016-01-01'})
,('DEF456',250,225,{d'2015-06-01'})
,('DEF456',225,200,{d'2016-01-01'});

WITH AllData AS
(
    SELECT ROW_NUMBER() OVER(PARTITION BY Combined.PRODUCT_ID ORDER BY ISNULL(Combined.CHANGE_DATE,{d'9999-12-31'}) ASC) AS Inx
          ,*
          ,CASE WHEN CHANGE_DATE IS NULL THEN 'Y' ELSE 'N' END AS CURRENT_PRICE_INDICATOR
    FROM
    (
        SELECT p.PRODUCT_ID AS PRODUCT_ID
              ,p.CURRENT_PRICE AS PRODUCT_PRICE
              ,NULL AS CHANGE_DATE
        FROM @price AS p
        UNION ALL
        SELECT ph.PRODUCT_ID
              ,ph.OLD_PRICE
              ,ph.CHANGE_DATE
        FROM @priceHist AS ph
    ) AS Combined
)
SELECT ad.PRODUCT_ID
      ,ad.PRODUCT_PRICE
      --Version with LAG (SQL Server 2012 and higher)
      --,CASE WHEN ad.Inx=1 THEN {d'1900-01-01'} ELSE LAG(ad.CHANGE_DATE,1) OVER(PARTITION BY ad.PRODUCT_ID ORDER BY ISNULL(ad.CHANGE_DATE,{d'9999-12-31'}) ASC) END AS VALID_FROM
      ,CASE WHEN ad.Inx=1 THEN {d'1900-01-01'} ELSE LAG_Replace_For_SQLServer2008.CHANGE_DATE END AS VALID_FROM
      ,CASE WHEN ad.CURRENT_PRICE_INDICATOR='Y' THEN {d'9999-12-31'} ELSE DATEADD(DAY,-1,ad.CHANGE_DATE) END AS VALID_TO
      ,ad.CURRENT_PRICE_INDICATOR
FROM AllData AS ad
OUTER APPLY
(
    SELECT x.CHANGE_DATE 
    FROM AllData AS x
    WHERE x.PRODUCT_ID=ad.PRODUCT_ID
      AND x.Inx=ad.Inx-1
) LAG_Replace_For_SQLServer2008

Result:

ABC123  275.0000    1900-01-01  2015-12-31  N
ABC123  250.0000    2016-01-01  9999-12-31  Y
DEF456  250.0000    1900-01-01  2015-05-31  N
DEF456  225.0000    2015-06-01  2015-12-31  N
DEF456  200.0000    2016-01-01  9999-12-31  Y
GHI789  325.0000    1900-01-01  9999-12-31  Y
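
To illustrate how the VALID_FROM / VALID_TO ranges are meant to be used, here is a sketch of a point-in-time lookup. It assumes the result above has been saved into a table called Dim_Product_Price (a hypothetical name, not from the question); @AsOf is likewise just an illustrative variable.

```sql
-- Hypothetical table holding the result set above:
--   Dim_Product_Price(PRODUCT_ID, PRODUCT_PRICE, VALID_FROM, VALID_TO, CURRENT_PRICE_INDICATOR)
DECLARE @AsOf DATE = '20151015';

-- The ranges are inclusive and do not overlap, so BETWEEN picks one row per product
SELECT d.PRODUCT_ID
      ,d.PRODUCT_PRICE
FROM Dim_Product_Price AS d
WHERE @AsOf BETWEEN d.VALID_FROM AND d.VALID_TO;
```

With the sample data, 15 Oct 2015 falls in the 275 row for ABC123, the 225 row for DEF456 and the 325 row for GHI789.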