Generating a histogram in SQL Server

Time: 2013-04-28 22:02:54

Tags: sql-server

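I am using SQL Server 2012 and need to generate a histogram, conceptually similar to Google's screener.

The idea is to split all prices into 100 buckets of equal width (based on price), so that each bucket holds the count of items priced between that bucket's minimum and maximum. NTILE did not work: it tries to distribute the items evenly across the buckets (based on count).

So, this is what I have so far: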

select bucket, count(*)
from (
  select cast(PERCENT_RANK() OVER (ORDER BY Price DESC) * 100 as int) as bucket
  from MyTable
  where DataDate = '4/26/2012'
) t
group by bucket

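Is this a good way to generate a histogram in SQL Server 2012? Is there anything built into SQL Server 2012 for this task, or a better way to do it?

Thanks

1 Answer:

Answer 0: (score: 3)

Perhaps something like this: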

with cte as (
  -- base: the row heights 1..9, built by cross joining two 3-row sets
  select base = 1 + u + t*3 from (
    select 0 as u union all select 1 union all select 2
  ) T1
  cross join (
    select 0 as t union all select 1 union all select 2
  ) T2
), data as (
  -- one row holding the ten bar heights x0..x9
  select * 
  from ( 
   values (1,1,2,3,3,5,7,4,2,1)
  ) data(x0,x1,x2,x3,x4,x5,x6,x7,x8,x9)
)
select cte.base
  ,case when x0>=base then 'X' else  ' ' end as x0
  ,case when x1>=base then 'X' else  ' ' end as x1
  ,case when x2>=base then 'X' else  ' ' end as x2
  ,case when x3>=base then 'X' else  ' ' end as x3
  ,case when x4>=base then 'X' else  ' ' end as x4
  ,case when x5>=base then 'X' else  ' ' end as x5
  ,case when x6>=base then 'X' else  ' ' end as x6
  ,case when x7>=base then 'X' else  ' ' end as x7
  ,case when x8>=base then 'X' else  ' ' end as x8
  ,case when x9>=base then 'X' else  ' ' end as x9
from cte
cross join data
order by base desc
;

This nicely produces the following histogram:

base        x0   x1   x2   x3   x4   x5   x6   x7   x8   x9
----------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
9                                                         
8                                                         
7                                         X               
6                                         X               
5                                    X    X               
4                                    X    X    X          
3                          X    X    X    X    X          
2                     X    X    X    X    X    X    X     
1           X    X    X    X    X    X    X    X    X    X

Be sure to pivot your data into a single row first.
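One way that preparation step might look is sketched below. It assumes the MyTable, Price and DataDate names from the question, and it uses 10 equal-width buckets rather than 100 so that the result lines up with the x0..x9 columns; both the bucket count and the CTE names (bounds, bucketed, clamped) are choices made for this illustration only. Each price is mapped to a bucket by its position between the minimum and maximum price, and the per-bucket counts are then pivoted into one row with conditional aggregation.

```sql
-- Sketch only: MyTable, Price and DataDate come from the question;
-- the 10-bucket choice and all CTE names are assumptions.
with bounds as (
  select min(Price) as lo, max(Price) as hi
  from MyTable
  where DataDate = '4/26/2012'
), bucketed as (
  -- position of each price within [lo, hi], scaled to 0..10;
  -- nullif avoids a divide-by-zero when every price is the same
  select cast(floor(10.0 * (m.Price - b.lo) / nullif(b.hi - b.lo, 0)) as int) as raw_bucket
  from MyTable m
  cross join bounds b
  where m.DataDate = '4/26/2012'
    and m.Price is not null
), clamped as (
  -- the maximum price would land in bucket 10, so fold it into bucket 9;
  -- a NULL here means hi = lo, so everything goes into bucket 0
  select case
           when raw_bucket is null then 0
           when raw_bucket > 9 then 9
           else raw_bucket
         end as bucket
  from bucketed
)
-- conditional aggregation turns the per-bucket counts into a single row
select
   sum(case when bucket = 0 then 1 else 0 end) as x0
  ,sum(case when bucket = 1 then 1 else 0 end) as x1
  ,sum(case when bucket = 2 then 1 else 0 end) as x2
  ,sum(case when bucket = 3 then 1 else 0 end) as x3
  ,sum(case when bucket = 4 then 1 else 0 end) as x4
  ,sum(case when bucket = 5 then 1 else 0 end) as x5
  ,sum(case when bucket = 6 then 1 else 0 end) as x6
  ,sum(case when bucket = 7 then 1 else 0 end) as x7
  ,sum(case when bucket = 8 then 1 else 0 end) as x8
  ,sum(case when bucket = 9 then 1 else 0 end) as x9
from clamped;
```

The counts in this row can be far larger than 9, so they would likely need to be scaled down (for example, relative to the largest count) before being compared against base in the drawing query.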

For a more compact representation, concatenate the individual data columns into one long string.
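A minimal sketch of that compact form, reusing the sample data from the query above. The column name bars is made up for this example, and the 1..9 heights are listed with a VALUES constructor instead of the cross join purely to keep the sketch short.

```sql
-- Sketch only: same sample data as above; "bars" is a hypothetical column name.
with cte as (
  select base
  from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) v(base)
), data as (
  select *
  from (
   values (1,1,2,3,3,5,7,4,2,1)
  ) data(x0,x1,x2,x3,x4,x5,x6,x7,x8,x9)
)
select cte.base
  -- one character per bar, concatenated into a single string per height
  ,case when x0>=base then 'X' else ' ' end
  +case when x1>=base then 'X' else ' ' end
  +case when x2>=base then 'X' else ' ' end
  +case when x3>=base then 'X' else ' ' end
  +case when x4>=base then 'X' else ' ' end
  +case when x5>=base then 'X' else ' ' end
  +case when x6>=base then 'X' else ' ' end
  +case when x7>=base then 'X' else ' ' end
  +case when x8>=base then 'X' else ' ' end
  +case when x9>=base then 'X' else ' ' end as bars
from cte
cross join data
order by base desc
;
```

Each output row is then a height plus a single ten-character string, which reads best in a fixed-width font.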