Calculating percentile values in SSAS

Time: 2014-03-05 16:45:48

Tags: ssas mdx business-intelligence olap-cube percentile

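I am trying to calculate a percentile in a cube (for example, the 90th percentile of my measure), and I think I am almost there. The problem is that I can return the row number of the 90th percentile, but I do not know how to get the value of my measure at that position.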

With
Member [Measures].[cnt] as 
Count(NonEmpty(
-- dimensions to find the percentile on (the same set must be repeated below)
[Calendar].[Hierarchy].members * 
[Region Dim].[Region].members * 
[Product Dim].[Product].members
,
-- add the measure to group
[Measures].[Profit]))

-- define percentile
Member [Measures].[Percentile] as 90

Member [Measures].[PercentileInt] as Int((([Measures].[cnt]) * [Measures].[Percentile]) / 100)

-- this part finds the tuple in the set at the index of the percentile point;
-- I use Item(index) to get the necessary info from the tuple, but I am unable to get the measure part
Member [Measures].[PercentileLo] as
(
Order(
NonEmpty(
    [Calendar].[Hierarchy].members * 
    [Region Dim].[Region].members * 
    [Product Dim].[Product].members,
    [Measures].[Profit]),
    [Measures].[Profit].Value, BDESC)).Item([Measures].[PercentileInt]).Item(3)

select
{
[Measures].[cnt],
[Measures].[Percentile],[Measures].[PercentileInt],
[Measures].[PercentileLo],
[Measures].[Profit]
}
on 0
from
[TestData]

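I think there must be a way to get the measure for the tuple found by its index in the set. Please help, and let me know if you need more information. Thanks!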

2 Answers:

Answer 0: (score: 1)

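You should extract the tuple at position [Measures].[PercentileInt] from your set and add the measure to it, building a tuple of four elements. Then return the value of that tuple as the measure PercentileLo, i.e. define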

Member [Measures].[PercentileLo] as
(
[Measures].[Profit],
Order(
NonEmpty(
    [Calendar].[Hierarchy].members * 
    [Region Dim].[Region].members * 
    [Product Dim].[Product].members,
    [Measures].[Profit]),
    [Measures].[Profit], BDESC).Item([Measures].[PercentileInt])
)

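The way you implemented it, you try to extract the fourth item (Item() counts from zero, so Item(3) is the fourth) from a tuple that contains only three elements, because your ordered set has only three hierarchies.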
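The indexing logic described above can be sketched outside MDX. The following Python snippet is only an illustration under stated assumptions: the data, the `profit` dictionary, and the `percentile_lo` helper are all hypothetical and do not come from the asker's cube. It mimics NonEmpty, Order with BDESC, and the zero-based Item(index) lookup, then reads the measure for the chosen tuple instead of asking the tuple for a fourth member.

```python
# Hypothetical data: each key is a (calendar, region, product) tuple and
# each value is that tuple's Profit. None of this comes from the real cube.
profit = {("2014-%02d" % m, "North", "A"): 10.0 * m for m in range(1, 11)}
profit[("2014-11", "North", "A")] = None  # an empty cell, dropped by NonEmpty


def percentile_lo(profit_by_tuple, percentile):
    """Return the tuple at the percentile index and its Profit value."""
    # NonEmpty(...): keep only tuples that have a Profit value.
    non_empty = [t for t, v in profit_by_tuple.items() if v is not None]
    # Order(..., BDESC): sort by Profit, largest first.
    ordered = sorted(non_empty, key=lambda t: profit_by_tuple[t], reverse=True)
    # Int(cnt * Percentile / 100), as in [Measures].[PercentileInt].
    index = int(len(ordered) * percentile / 100)
    # .Item(index) on the set is zero-based and yields a 3-member tuple.
    chosen = ordered[index]
    # Pairing the tuple with the measure gives the value; the tuple itself
    # has only positions 0, 1 and 2, so asking it for position 3 fails.
    return chosen, profit_by_tuple[chosen]


chosen, value = percentile_lo(profit, 90)
print(chosen, value)
```

With ten non-empty tuples the index is 9, which in a descending sort is the tuple with the smallest Profit, so it is worth checking that the sort direction matches the percentile you intend.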

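Another, unrelated comment: I think you should avoid using the complete hierarchies [Calendar].[Hierarchy].members, [Region Dim].[Region].members, and [Product Dim].[Product].members. Your code looks as if it includes all levels (including the All member) in the calculation. But I do not know the structure and names of your cube, so I may be wrong.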

Answer 1: (score: 1)

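An alternative might be to find the median of the top 20% of records in the table. I have used this combination of functions to find the 75th percentile. By dividing the record count by 5, you can use the TopCount function to return a set of tuples that makes up 20% of the whole table, sorted in descending order by your target measure. Since that set spans the 80th to 100th percentiles, the Median function should then land you on the 90th percentile value without having to find the coordinates of the record. In my own use, I pass the same measure as the last argument to both the Median and TopCount functions.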

Here is my code:

WITH MEMBER Measures.[90th Percentile] AS MEDIAN(
    TOPCOUNT(
        [set definition]
        ,Measures.[Fact Table Record Count] / 5
        ,Measures.[Value by which to sort the set so the first 20% of records are chosen]
    )
    ,Measures.[Value from which the median should be determined]
)

Based on what you provided in the question, I would expect your code to look something like this:

WITH MEMBER Measures.[90th Percentile] AS MEDIAN(
    TOPCOUNT(
        {
           [Calendar].[Hierarchy].members * 
           [Region Dim].[Region].members * 
           [Product Dim].[Product].members
        }
        ,Measures.[Fact Table Record Count] / 5
        ,[Measures].[Profit]
    )
    ,[Measures].[Profit]
)
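The reasoning behind this answer can be checked numerically outside MDX. The Python sketch below is an illustration only: the `top_fraction_median` helper and the sample values 1 to 100 are hypothetical, and the integer division used for the record count is an assumption about how a fractional count would be handled. It takes the top fifth of the values in descending order, as TOPCOUNT does, and returns their median.

```python
import statistics


def top_fraction_median(values, divisor=5):
    """Median of the top 1/divisor of values, mirroring MEDIAN(TOPCOUNT(...))."""
    # Record count / 5; integer division is an assumption made for this sketch.
    count = len(values) // divisor
    # TOPCOUNT: the `count` largest values, sorted descending.
    top = sorted(values, reverse=True)[:count]
    # MEDIAN over the same measure that was used for sorting.
    return statistics.median(top)


values = list(range(1, 101))  # hypothetical measure values 1..100
print(top_fraction_median(values))  # 90.5
```

For these 100 values the top 20% are 81 to 100, and their median, 90.5, sits between the 90th and 91st values, which is where a 90th percentile is expected to fall.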