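Background:
There is a table "ProductCosts". The first sample data set shows correctly entered data. The data is entered in Excel and ingested by an ETL process. The table holds successive cost figures: "4_Costs" is the most recent, then "3_Costs", and so on.
In this case "3_Costs" is the most recently entered cost: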
Category          Product  ISOMonth  1_Costs    2_Costs    3_Costs    4_Costs
------------------------------------------------------------------------------
ProductCategory1  Stuff    2017-10   40,000.00  40,000.00  50,000.00  NULL
ProductCategory1  Stuff    2017-10   10,000.00  10,000.00  00.00      NULL
ProductCategory1  Stuff    2017-10   10,000.00  10,000.00  00.00      NULL
In the second and third rows you can see that the 10,000.00 from "2_Costs" was replaced by 00.00 in "3_Costs". To determine CurrentCosts, the following simple logic is applied (see COALESCE):
SELECT Category
     , Product
     , ISOMonth
     , COALESCE([4_Costs], [3_Costs], [2_Costs], [1_Costs]) AS CurrentCosts
FROM [ProductCosts]
Correct result:
Category          Product  ISOMonth  CurrentCosts
--------------------------------------------------
ProductCategory1  Stuff    2017-10   50,000.00
ProductCategory1  Stuff    2017-10   00.00
ProductCategory1  Stuff    2017-10   00.00
Summing CurrentCosts gives a total of 50,000.00. This works well as long as the input data is correct.
Incorrect data:
Category          Product  ISOMonth  1_Costs    2_Costs    3_Costs    4_Costs  CurrentCosts
--------------------------------------------------------------------------------------------
ProductCategory1  Stuff    2017-10   40,000.00  40,000.00  50,000.00  NULL     50,000.00
ProductCategory1  Stuff    2017-10   10,000.00  10,000.00  NULL       NULL     10,000.00
ProductCategory1  Stuff    2017-10   10,000.00  10,000.00  NULL       NULL     10,000.00
In this case the user forgot to enter 00.00 in the second and third rows of column "3_Costs". This leads to wrong values in the CurrentCosts column:
Category          Product  ISOMonth  CurrentCosts
--------------------------------------------------
ProductCategory1  Stuff    2017-10   50,000.00
ProductCategory1  Stuff    2017-10   10,000.00
ProductCategory1  Stuff    2017-10   10,000.00
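Summing CurrentCosts now gives 70,000.00, which is wrong because the user forgot to overwrite the previous 10,000.00 with 00.00.
Assertion: if one value of a column such as "3_Costs" is non-NULL (here, for example, 50,000.00) for a given category, product and month, then the other values of that column in the same group must not be NULL.
Example of bad data: see the data set "Incorrect data". If there is a "3_Costs" value in the first row, there must also be a value in the second and third rows.
A SQL query that returns a flag such as "has_incomplete_cost_column" would be fine. Then I would know that the data is inconsistent.
Constraints: I have to keep the existing data model and concept, because it is already implemented this way. The input data is delivered as Excel worksheets, so there is no user interface in which these errors could be caught.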
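Answer 0 (score: 2)
How about using analytic (window) functions with a CASE expression, or subqueries, to get the total of each column, and then having CASE return the value from that same column?
The root problem is that the coalesce needs to happen on the sum of each column within the group, not on individual rows; you then display the row value rather than the sum.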
-- Sample rows from the question's "incorrect data" set, plus one all-NULL row.
WITH ProductCosts (Category, Product, ISOMonth, [1_Costs], [2_Costs], [3_Costs], [4_Costs]) AS (
    SELECT 'ProductCategory1', 'Stuff', '2017-10', 40000.00, 40000.00, 50000.00, CAST(NULL AS numeric(10,2)) UNION ALL
    SELECT 'ProductCategory1', 'Stuff', '2017-10', 10000.00, 10000.00, NULL, CAST(NULL AS numeric(10,2)) UNION ALL
    SELECT 'ProductCategory1', 'Stuff', '2017-10', 10000.00, 10000.00, NULL, CAST(NULL AS numeric(10,2)) UNION ALL
    SELECT 'ProductCategory1', 'Stuff', '2017-10', NULL, NULL, NULL, CAST(NULL AS numeric(10,2)))
-- Pick the newest cost column whose group total is positive, then return that column's row value.
SELECT Category, Product, ISOMonth,
       CASE WHEN SUM([4_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [4_Costs]
            WHEN SUM([3_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [3_Costs]
            WHEN SUM([2_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [2_Costs]
            WHEN SUM([1_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [1_Costs]
       END AS currentprice
FROM ProductCosts A
This gives us (with either the approach above or the one below):
+----+------------------+---------+----------+--------------+
| | Category | Product | ISOMonth | currentprice |
+----+------------------+---------+----------+--------------+
| 1 | ProductCategory1 | Stuff | 2017-10 | 50000,00 |
| 2 | ProductCategory1 | Stuff | 2017-10 | NULL |
| 3 | ProductCategory1 | Stuff | 2017-10 | NULL |
| 4 | ProductCategory1 | Stuff | 2017-10 | NULL |
+----+------------------+---------+----------+--------------+
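One caveat with the `> 0` test: a column whose values in a group sum to zero or less is skipped, even though it was filled in. The following is a minimal sketch of a variation, not the answerer's method, that tests for the presence of any non-NULL value with COUNT instead of SUM. The two sample rows are hypothetical and were chosen so that "3_Costs" contains only a zero.

```sql
-- Sketch only: hypothetical sample rows in which 3_Costs holds a single 0.00.
WITH ProductCosts (Category, Product, ISOMonth, [1_Costs], [2_Costs], [3_Costs], [4_Costs]) AS (
    SELECT 'ProductCategory1', 'Stuff', '2017-10', 40000.00, 40000.00, 0.00, CAST(NULL AS numeric(10,2)) UNION ALL
    SELECT 'ProductCategory1', 'Stuff', '2017-10', 10000.00, 10000.00, NULL, CAST(NULL AS numeric(10,2))
)
-- COUNT(column) ignores NULLs, so "> 0" means at least one row in the
-- group has a value in that column, regardless of its sign or size.
SELECT Category, Product, ISOMonth,
       CASE WHEN COUNT([4_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [4_Costs]
            WHEN COUNT([3_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [3_Costs]
            WHEN COUNT([2_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [2_Costs]
            WHEN COUNT([1_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [1_Costs]
       END AS currentprice
FROM ProductCosts;
```

With these rows, "3_Costs" is chosen for the whole group, so the first row returns its 0.00 and the second returns NULL. The SUM-based test would instead fall through to "2_Costs", because the "3_Costs" total is not greater than zero.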
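A few notes:
An alternative approach follows. Without testing, I'm not sure whether the repeated analytic functions or the subqueries would be faster. My guess is that each subquery is evaluated only once, whereas the analytic functions have to run for every row; but perhaps the engine recognizes this and optimizes accordingly.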
-- Same logic with derived tables: each one holds the per-group total of one cost column.
SELECT PC.Category, PC.Product, PC.ISOMonth,
       CASE WHEN D.[4_Costs] > 0 THEN PC.[4_Costs]
            WHEN C.[3_Costs] > 0 THEN PC.[3_Costs]
            WHEN B.[2_Costs] > 0 THEN PC.[2_Costs]
            WHEN A.[1_Costs] > 0 THEN PC.[1_Costs]
       END AS currentprice
FROM ProductCosts PC
INNER JOIN (SELECT SUM([4_Costs]) [4_Costs], Category, Product, ISOMonth FROM ProductCosts GROUP BY Category, Product, ISOMonth) D
   ON D.Category = PC.Category
  AND D.Product = PC.Product
  AND D.ISOMonth = PC.ISOMonth
INNER JOIN (SELECT SUM([3_Costs]) [3_Costs], Category, Product, ISOMonth FROM ProductCosts GROUP BY Category, Product, ISOMonth) C
   ON C.Category = PC.Category
  AND C.Product = PC.Product
  AND C.ISOMonth = PC.ISOMonth
INNER JOIN (SELECT SUM([2_Costs]) [2_Costs], Category, Product, ISOMonth FROM ProductCosts GROUP BY Category, Product, ISOMonth) B
   ON B.Category = PC.Category
  AND B.Product = PC.Product
  AND B.ISOMonth = PC.ISOMonth
INNER JOIN (SELECT SUM([1_Costs]) [1_Costs], Category, Product, ISOMonth FROM ProductCosts GROUP BY Category, Product, ISOMonth) A
   ON A.Category = PC.Category
  AND A.Product = PC.Product
  AND A.ISOMonth = PC.ISOMonth
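The question also asks for a flag such as "has_incomplete_cost_column". Neither query above returns one, so here is a minimal sketch of how it might be derived. It assumes that "incomplete" means a cost column is filled in for some, but not all, rows of a Category/Product/ISOMonth group. The sample rows mirror the question's "incorrect data" set.

```sql
-- Sketch only: sample rows mirror the "incorrect data" set from the question.
WITH ProductCosts (Category, Product, ISOMonth, [1_Costs], [2_Costs], [3_Costs], [4_Costs]) AS (
    SELECT 'ProductCategory1', 'Stuff', '2017-10', 40000.00, 40000.00, 50000.00, CAST(NULL AS numeric(10,2)) UNION ALL
    SELECT 'ProductCategory1', 'Stuff', '2017-10', 10000.00, 10000.00, NULL, CAST(NULL AS numeric(10,2)) UNION ALL
    SELECT 'ProductCategory1', 'Stuff', '2017-10', 10000.00, 10000.00, NULL, CAST(NULL AS numeric(10,2))
)
SELECT Category, Product, ISOMonth,
       -- COUNT(column) skips NULLs while COUNT(*) counts every row, so a
       -- column is incomplete when its count is neither 0 nor the row count.
       CASE WHEN COUNT([1_Costs]) NOT IN (0, COUNT(*))
              OR COUNT([2_Costs]) NOT IN (0, COUNT(*))
              OR COUNT([3_Costs]) NOT IN (0, COUNT(*))
              OR COUNT([4_Costs]) NOT IN (0, COUNT(*))
            THEN 1 ELSE 0
       END AS has_incomplete_cost_column
FROM ProductCosts
GROUP BY Category, Product, ISOMonth;
```

For the sample rows the flag is 1, because "3_Costs" has a value in only one of the three rows. Groups where every column is either fully filled or entirely NULL get 0.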