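Logic behind loading dimension tables

Date: 2013-10-08 14:15:53

Tags: oracle11g data-warehouse informatica-powercenter dimensional-modeling

How do I populate dimension tables (Dim_tbls) from a relational source?

Given these example tables: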

tbl_sales:    id_sales, fk_id_customer, fk_id_product, country, timestamp   
tbl_customer: id_customer, name, adress, zip, city
tbl_product:  id_product, price, product

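My goal is to get these attributes into a star schema. What I'm struggling with is the logic behind loading the dimension tables. I mean, which data would I load into Dim_Product? All the products in tbl_product? But then how would I know how many sales were made for a specific product?

The analyses I want to run are: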

 How many people bought product x.
 How many sales are made from city x.
 How many sales were made between Time x and y. 

Sample data:

 tbl_sales: id_sales | fk_id_customer | fk_id_product | country | timestamp 
                1    |       2        |      1        |   UK    | 19.11.2013 10:23:22
                2    |       1        |      2        |   FR    | 20.11.2013 06:04:22

 tbl_customer: id_customer | name | adress | zip | city
                      1    | Frank|Street X| 211 | London
                      2    | Steve|Street Y| 431 | Paris

 tbl_product: id_product | Price  | product
                  1      | 100,00 | Hammer
                  2      |  50,00 | Saw

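1 answer:

Answer 0 (score: 2)

Let's start with a very simple star schema model; for example, I'll assume you don't need to worry about handling changes to dimension attributes.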

factSales

  DateKey
  CustomerKey
  ProductKey
  Counter (=1; this is a factless fact table)

dimDate

  DateKey
  Date
  Year
  Quarter
  Month
  ...

dimCustomer

  CustomerKey
  Name
  Address
  Zip
  City

dimProduct

  ProductKey
  Name
  Price (if it changes, you need to move it to factSales)
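
To address the loading question directly, here is a minimal sketch of one way to fill this schema from the question's source tables. It is an illustration under assumptions, not an Informatica mapping: it uses Python's sqlite3 instead of Oracle, the SourceId columns (which keep the source system's id next to the surrogate key) are hypothetical additions, and the yyyymmdd integer DateKey is just one common convention. The point it shows is that every row of tbl_customer and tbl_product goes into its dimension, sold or not, while the fact table gets one row per sale that points at the dimension rows by key.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
-- Source tables with the sample rows from the question.
CREATE TABLE tbl_customer (id_customer INTEGER PRIMARY KEY, name TEXT,
                           adress TEXT, zip TEXT, city TEXT);
CREATE TABLE tbl_product  (id_product INTEGER PRIMARY KEY, price REAL,
                           product TEXT);
CREATE TABLE tbl_sales    (id_sales INTEGER PRIMARY KEY, fk_id_customer INTEGER,
                           fk_id_product INTEGER, country TEXT, timestamp TEXT);
INSERT INTO tbl_customer VALUES (1, 'Frank', 'Street X', '211', 'London'),
                                (2, 'Steve', 'Street Y', '431', 'Paris');
INSERT INTO tbl_product  VALUES (1, 100.00, 'Hammer'), (2, 50.00, 'Saw');
INSERT INTO tbl_sales    VALUES (1, 2, 1, 'UK', '19.11.2013 10:23:22'),
                                (2, 1, 2, 'FR', '20.11.2013 06:04:22');

-- Star schema. SourceId is a hypothetical column holding the source id.
CREATE TABLE dimCustomer (CustomerKey INTEGER PRIMARY KEY AUTOINCREMENT,
                          SourceId INTEGER UNIQUE, Name TEXT, Address TEXT,
                          Zip TEXT, City TEXT);
CREATE TABLE dimProduct  (ProductKey INTEGER PRIMARY KEY AUTOINCREMENT,
                          SourceId INTEGER UNIQUE, Name TEXT, Price REAL);
CREATE TABLE dimDate     (DateKey INTEGER PRIMARY KEY, Date TEXT,
                          Year INTEGER, Quarter INTEGER, Month INTEGER);
CREATE TABLE factSales   (DateKey INTEGER, CustomerKey INTEGER,
                          ProductKey INTEGER, Counter INTEGER);
""")

# Step 1: dimensions receive ALL customers and products, sold or not.
cur.execute("""INSERT INTO dimCustomer (SourceId, Name, Address, Zip, City)
               SELECT id_customer, name, adress, zip, city
               FROM tbl_customer ORDER BY id_customer""")
cur.execute("""INSERT INTO dimProduct (SourceId, Name, Price)
               SELECT id_product, product, price
               FROM tbl_product ORDER BY id_product""")

# Step 2: one fact row per sale, with source ids swapped for surrogate keys.
sales = cur.execute("""SELECT fk_id_customer, fk_id_product, timestamp
                       FROM tbl_sales ORDER BY id_sales""").fetchall()
for cust_id, prod_id, ts in sales:
    day = datetime.strptime(ts, "%d.%m.%Y %H:%M:%S").date()
    date_key = day.year * 10000 + day.month * 100 + day.day  # e.g. 20131119
    # Add the calendar day to dimDate only if it is not there yet.
    cur.execute("INSERT OR IGNORE INTO dimDate VALUES (?, ?, ?, ?, ?)",
                (date_key, day.isoformat(), day.year,
                 (day.month - 1) // 3 + 1, day.month))
    # Look up the surrogate keys through the hypothetical SourceId columns.
    cur.execute("""INSERT INTO factSales
                   SELECT ?, c.CustomerKey, p.ProductKey, 1
                   FROM dimCustomer c, dimProduct p
                   WHERE c.SourceId = ? AND p.SourceId = ?""",
                (date_key, cust_id, prod_id))
conn.commit()

fact_rows = cur.execute("""SELECT DateKey, CustomerKey, ProductKey, Counter
                           FROM factSales ORDER BY DateKey""").fetchall()
print(fact_rows)
```

In practice dimDate is usually pre-populated with a full calendar rather than built from the sales that happen to exist; it is derived here only to keep the sketch short.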

How many people bought product x.

SELECT COUNT(DISTINCT CustomerKey)
FROM factSales
WHERE ProductKey IN ( SELECT ProductKey
                      FROM dimProduct
                      WHERE Name = 'Product X' )

How many sales are made from city x.

SELECT SUM(Counter)
FROM factSales
WHERE CustomerKey IN ( SELECT CustomerKey
                       FROM dimCustomer
                       WHERE City = 'City X' )

How many sales were made between time x and y.

SELECT SUM(Counter)
FROM factSales
WHERE DateKey IN ( SELECT DateKey
                   FROM dimDate
                   WHERE Date BETWEEN DateX AND DateY )
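
The same three questions can also be written as joins between the fact table and one dimension each. The sketch below runs them against the question's two sample sales, using Python's sqlite3 as a stand-in for Oracle. The key values, the trimmed-down column lists, the ISO date strings, and the scalar helper are assumptions made for illustration. Note that Date is a reserved word in Oracle, so a real Oracle schema would need a different column name or a quoted identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star schema reduced to the columns these queries need (assumed values).
CREATE TABLE dimDate     (DateKey INTEGER PRIMARY KEY, Date TEXT);
CREATE TABLE dimCustomer (CustomerKey INTEGER PRIMARY KEY, Name TEXT, City TEXT);
CREATE TABLE dimProduct  (ProductKey INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE factSales   (DateKey INTEGER, CustomerKey INTEGER,
                          ProductKey INTEGER, Counter INTEGER);
INSERT INTO dimDate     VALUES (20131119, '2013-11-19'), (20131120, '2013-11-20');
INSERT INTO dimCustomer VALUES (1, 'Frank', 'London'), (2, 'Steve', 'Paris');
INSERT INTO dimProduct  VALUES (1, 'Hammer'), (2, 'Saw');
-- Sale 1: Steve bought a Hammer; sale 2: Frank bought a Saw.
INSERT INTO factSales   VALUES (20131119, 2, 1, 1), (20131120, 1, 2, 1);
""")


def scalar(sql, params=()):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql, params).fetchone()[0]


# How many people bought product x: count distinct customers, not rows.
buyers = scalar("""SELECT COUNT(DISTINCT f.CustomerKey)
                   FROM factSales f
                   JOIN dimProduct p ON p.ProductKey = f.ProductKey
                   WHERE p.Name = ?""", ("Hammer",))

# How many sales are made from city x: sum the counter over matching facts.
city_sales = scalar("""SELECT COALESCE(SUM(f.Counter), 0)
                       FROM factSales f
                       JOIN dimCustomer c ON c.CustomerKey = f.CustomerKey
                       WHERE c.City = ?""", ("Paris",))

# How many sales were made between x and y: filter through dimDate.
period_sales = scalar("""SELECT COALESCE(SUM(f.Counter), 0)
                         FROM factSales f
                         JOIN dimDate d ON d.DateKey = f.DateKey
                         WHERE d.Date BETWEEN ? AND ?""",
                      ("2013-11-19", "2013-11-20"))

print(buyers, city_sales, period_sales)
```

COALESCE is there because SUM returns NULL rather than 0 when no fact rows match the filter.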