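I have been searching Stack Overflow and Google and have not found an answer to my question, so here goes:
It has been a minute since I built a data warehouse project from scratch, so I am dusting off some of my past knowledge, and I am stuck on one of my data-loading scenarios.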
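I am creating a fact table (factOrderLines) and, of course, a number of dimensions. One of the dimensions I want to link to factOrderLines is dimItem. The problem is that an item is identified uniquely by its vendor and vendor part number, by its manufacturer and manufacturer part number, or by an identifier from a subset of items called ManagedItems (MngItemID).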
Source:
Vendor  VendorPartNo  Manufacturer  ManufacturerPartNo  MngItemID
100     3456          NULL          NULL                67
100     3254          03            1234                23
NULL    NULL          03            1235                24
NULL    NULL          15            5120                NULL
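The problem is that when I join from the source table to dimItem to populate factOrderLines, I have three lookup scenarios. This is causing the numbers to inflate and performance to become terrible.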
LEFT OUTER JOIN dimItem AS i ON
    (i.Vendor = src.Vendor AND i.VendorPartNo = src.VndrItemID) OR
    (i.Manufacturer = src.Manufacturer AND i.ManufacturerPartNo = src.MfgItemID) OR
    (i.MngItemID = src.MngItemID)
Is there a more efficient or better approach for this scenario than the one I have started implementing?
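To check whether the inflation comes from a single staging row matching more than one dimItem row, a diagnostic along these lines might help. It is only a sketch: the table and column names are taken from the full INSERT query in this post, and it assumes (this is not stated anywhere above) that PONumber plus POLineNbr identifies one staging row.

```sql
-- Sketch: list staging lines whose OR'd predicate matches several items.
-- Assumption: (PONumber, POLineNbr) identifies one row in stgOrders.
SELECT p.PONumber,
       p.POLineNbr,
       COUNT(DISTINCT i.ItemKey) AS MatchingItems
FROM stgOrders AS p
LEFT OUTER JOIN dimVendor AS v ON v.VendorID = p.VendorID
LEFT OUTER JOIN dimManufacturer AS m ON m.ManufacturerID = p.MfgID
INNER JOIN dimItem AS i
    ON (i.VendorKey = v.VendorKey AND i.VendorPartNo = p.VndrItemID)
    OR (i.ManufacturerKey = m.ManufacturerKey AND i.ManufacturerPartNo = p.MfgItemID)
    OR (i.MngItemID = p.MngItemID)
GROUP BY p.PONumber, p.POLineNbr
HAVING COUNT(DISTINCT i.ItemKey) > 1;
```

Any row returned here is a staging line that the fact load would write more than once, because each matching dimItem row produces its own output row.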
Edit: the full INSERT query (for better understanding)
INSERT INTO fctOrderLine
(PurchaseOrderKey
,DateKey
,PurchaseOrderLineNo
,VendorKey
,ManufacturerKey
,ItemKey
,UnitPrice
,Qty
,UnitOfMeasure
,LineTotal)
SELECT PurchaseOrderKey = po.PurchaseOrderKey
,DateKey = ISNULL(c.DateKey, 19000101)
,PurchaseOrderLineNo = ISNULL(p.POLineNbr, -1)
,VendorKey = ISNULL(v.VendorKey, -1)
,ManufacturerKey = ISNULL(m.ManufacturerKey, -1)
,ItemKey = ISNULL(i.ItemKey, -1)
,UnitPrice = ISNULL(p.UnitPrice, -1.00)
,Qty = ISNULL(p.POQty, -1.00)
,UnitOfMeasure = ISNULL(p.ANSI_UOM, N'UNKNOWN')
,LineTotal = ISNULL(p.LineTotalCost, -1)
FROM stgOrders AS p
INNER JOIN dimPurchaseOrder AS po ON po.OrderNo = p.PONumber
LEFT OUTER JOIN dimCalendar AS c ON c.Date =
    (CASE WHEN p.DT_PO IS NULL OR ISDATE(REPLACE(p.DT_PO, '''', '')) = 0
          THEN CAST('19000101' AS DATETIME)
          ELSE REPLACE(p.DT_PO, '''', '')
     END)
LEFT OUTER JOIN dimVendor AS v ON v.VendorID = p.VendorID
LEFT OUTER JOIN dimManufacturer AS m ON m.ManufacturerID = p.MfgID
LEFT OUTER JOIN dimItem AS i ON
    (i.VendorKey = v.VendorKey AND i.VendorPartNo = p.VndrItemID) OR
    (i.ManufacturerKey = m.ManufacturerKey AND i.ManufacturerPartNo = p.MfgItemID) OR
    (i.MngItemID = p.MngItemID)
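For comparison, one commonly suggested alternative to a single OR'd join is to join dimItem once per lookup scenario and pick the first key that matches. The sketch below shows only the item lookup, not the whole INSERT. The aliases iMng, iVnd and iMfg are made up for illustration, the precedence order (managed item first, then vendor, then manufacturer) is an arbitrary assumption, and it only avoids row inflation if each identifier matches at most one dimItem row.

```sql
-- Sketch: one equality join per lookup scenario, then take the first hit.
-- Assumption: each identifier matches at most one row in dimItem.
-- Assumption: the precedence order in COALESCE is arbitrary; adjust as needed.
SELECT p.PONumber,
       p.POLineNbr,
       ItemKey = COALESCE(iMng.ItemKey, iVnd.ItemKey, iMfg.ItemKey, -1)
FROM stgOrders AS p
LEFT OUTER JOIN dimVendor AS v ON v.VendorID = p.VendorID
LEFT OUTER JOIN dimManufacturer AS m ON m.ManufacturerID = p.MfgID
LEFT OUTER JOIN dimItem AS iMng
    ON iMng.MngItemID = p.MngItemID
LEFT OUTER JOIN dimItem AS iVnd
    ON iVnd.VendorKey = v.VendorKey
   AND iVnd.VendorPartNo = p.VndrItemID
LEFT OUTER JOIN dimItem AS iMfg
    ON iMfg.ManufacturerKey = m.ManufacturerKey
   AND iMfg.ManufacturerPartNo = p.MfgItemID;
```

Each join here is a plain equality predicate, which the optimizer can usually satisfy with an index on the corresponding dimItem columns, whereas an OR across different column sets often cannot use one. Whether this actually helps depends on the indexes and data, so comparing execution plans would be the way to confirm.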