Doing a simple pivot on year

Time: 2012-12-04 19:15:00

Tags: sql sql-server sql-server-2008 tsql pivot

I have a table:

+----+-------+------+
| id | times | year |
+----+-------+------+
|  5 |     2 | 2008 |
|  6 |    76 | 2008 |
|  2 |    43 | 2009 |
|  4 |     5 | 2009 |
|  1 |     3 | 2010 |
|  9 |     6 | 2010 |
|  7 |   444 | 2011 |
|  8 |     3 | 2011 |
|  3 |    65 | 2012 |
+----+-------+------+

I would like to build a pivot from this table, bucketing times per year:

+--------+------+------+------+------+------+
|        | 2008 | 2009 | 2010 | 2011 | 2012 |
+--------+------+------+------+------+------+
| 0      |      |      |      |      |      |
| 1-30   |    1 |    1 |    2 |    1 |      |
| 31-60  |      |    1 |      |      |      |
| 61-90  |    1 |      |      |      |    1 |
| 91-120 |      |      |      |      |      |
| 121+   |      |      |      |    1 |      |
+--------+------+------+------+------+------+

How do I start to tackle this challenge with SQL? Thank you very much for your guidance.

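2 Answers:

Answer 0: (score: 6)

You can use the SQL Server PIVOT function for this. If you know all of the values for the years as well as the buckets, you can hard-code the query: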

select *
from
(
  select 
    case 
      when times = 0 then '0' 
      when times >= 1 and times <=30 then '1-30'
      when times >= 31 and times <=60 then '31-60'  
      when times >= 61 and times <=90 then '61-90' 
      when times >= 91 and times <=120 then '91-120' 
      else '121+' end bucket,
    year
  from yourtable
) src
pivot
(
  count(year)
  for year in ([2008], [2009], [2010], [2011], [2012])
) piv;

See SQL Fiddle with Demo
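
To try the queries locally, something like the following setup could be used. This is only a sketch: it assumes the table is named yourtable (the placeholder used in the queries here) and that all three columns are int, which the question does not actually state. The rows are the ones listed in the question.

```sql
-- Assumed schema: the question shows the data but not the column types.
create table yourtable
(
  id    int,
  times int,
  year  int
);

-- Sample rows copied from the question.
insert into yourtable (id, times, year) values
  (5,   2, 2008),
  (6,  76, 2008),
  (2,  43, 2009),
  (4,   5, 2009),
  (1,   3, 2010),
  (9,   6, 2010),
  (7, 444, 2011),
  (8,   3, 2011),
  (3,  65, 2012);
```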

If you don't have access to the PIVOT function, you can use an aggregate function with a CASE expression:

select bucket,
  sum(case when year = 2008 then 1 else 0 end) [2008],
  sum(case when year = 2009 then 1 else 0 end) [2009],
  sum(case when year = 2010 then 1 else 0 end) [2010],
  sum(case when year = 2011 then 1 else 0 end) [2011],
  sum(case when year = 2012 then 1 else 0 end) [2012]
from
(
  select 
    case 
      when times = 0 then '0' 
      when times >= 1 and times <=30 then '1-30' 
      when times >= 31 and times <=60 then '31-60'  
      when times >= 61 and times <=90 then '61-90' 
      when times >= 91 and times <=120 then '91-120' 
      else '121+' end bucket,
    year
  from yourtable
) src
group by bucket

See SQL Fiddle with Demo

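If you need every bucket to be listed, including the empty ones, you will want to store the bucket ranges either in a table or in a CTE, and then you can use the following: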

;with buckets(startbucket, endbucket, rnk) as
(
  select 0, 0, 1 
  union all
  select 1, 30, 2
  union all
  select 31, 60, 3
  union all
  select 61, 90, 4
  union all
  select 91, 120, 5
  union all
  select 121, null, 6
)
select 
  case when startbucket = 0 then '0'
    when endbucket is null then cast(startbucket as varchar(50)) + '+'
    else cast(startbucket as varchar(50)) + '-'+cast(endbucket as varchar(50)) end buckets,
  [2008], [2009], [2010], [2011], [2012]
from
(
  select rnk,
    year, 
    startbucket, 
    endbucket
  from buckets b
  left join yourtable t
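    on t.times >= b.startbucket and t.times <= coalesce(b.endbucket, 100000) -- 100000 stands in as the upper bound of the open-ended bucket
) src
pivot
(
  count(year)
  for year in ([2008], [2009], [2010], [2011], [2012])
) piv
order by rnk; -- rnk keeps the buckets in range order; without it the row order is not guaranteed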

See SQL Fiddle with Demo

Result:

| BUCKETS | 2008 | 2009 | 2010 | 2011 | 2012 |
----------------------------------------------
|       0 |    0 |    0 |    0 |    0 |    0 |
|    1-30 |    1 |    1 |    2 |    1 |    0 |
|   31-60 |    0 |    1 |    0 |    0 |    0 |
|   61-90 |    1 |    0 |    0 |    0 |    1 |
|  91-120 |    0 |    0 |    0 |    0 |    0 |
|    121+ |    0 |    0 |    0 |    1 |    0 |
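
One thing to watch in the bucket join is the coalesce(b.endbucket, 100000) sentinel: a times value above 100000 would match no bucket at all. A possible alternative, sketched here rather than taken from the answer, is to treat a null endbucket as "no upper limit" directly in the join predicate. The rows_in_bucket alias is made up for illustration, and the query counts rows per bucket without pivoting so the effect of the predicate is easy to see.

```sql
-- Sketch: same buckets as above, but a null endbucket means "unbounded".
;with buckets(startbucket, endbucket, rnk) as
(
  select 0, 0, 1
  union all
  select 1, 30, 2
  union all
  select 31, 60, 3
  union all
  select 61, 90, 4
  union all
  select 91, 120, 5
  union all
  select 121, null, 6
)
select b.rnk,
  b.startbucket,
  b.endbucket,
  count(t.id) as rows_in_bucket   -- hypothetical alias; counts matched rows only
from buckets b
left join yourtable t
  on t.times >= b.startbucket
  and (b.endbucket is null or t.times <= b.endbucket)
group by b.rnk, b.startbucket, b.endbucket
order by b.rnk;
```

The same predicate could be dropped into the pivot query in place of the coalesce comparison.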

The above works well if you have a known number of values (years) to transpose. If the number is unknown, you will want to use dynamic SQL, similar to this:

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(year) 
                    from yourtable
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'with buckets(startbucket, endbucket, rnk) as
              (
                select 0, 0, 1 
                union all
                select 1, 30, 2
                union all
                select 31, 60, 3
                union all
                select 61, 90, 4
                union all
                select 91, 120, 5
                union all
                select 121, null, 6
              )
              select 
                case when startbucket = 0 then ''0''
                  when endbucket is null then cast(startbucket as varchar(50)) + ''+''
                  else cast(startbucket as varchar(50)) + ''-''+cast(endbucket as varchar(50)) end buckets,
                '+@cols+'
              from
              (
                select rnk,
                  year, 
                  startbucket, endbucket
                from buckets b
                left join yourtable t
                  on t.times >= b.startbucket and t.times <= coalesce(b.endbucket, 100000)
              ) src
              pivot
              (
                count(year)
                for year in ('+@cols+')
              ) piv
              order by rnk;'

execute(@query)

See SQL Fiddle with Demo

The result is the same for both the static (hard-coded) version and the dynamic version.
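
On SQL Server 2017 and later, the column list could also be built with STRING_AGG instead of the STUFF / FOR XML PATH idiom. This is a sketch of that alternative only; it is not part of the original answer and it does not apply to SQL Server 2008, which the question is tagged with. It assumes the same yourtable placeholder.

```sql
-- Requires SQL Server 2017 or later; STRING_AGG is not available on 2008.
DECLARE @cols AS NVARCHAR(MAX);

-- STRING_AGG has no DISTINCT option, so de-duplicate the years in a derived table first.
select @cols = STRING_AGG(QUOTENAME(year), ',') WITHIN GROUP (ORDER BY year)
from
(
  select distinct year
  from yourtable
) y;

-- Inspect the generated list before splicing it into the dynamic query.
select @cols as cols;
```

With the sample data from the question, @cols should come out as [2008],[2009],[2010],[2011],[2012], which can then be concatenated into @query exactly as before.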

Answer 1: (score: 3)

Darn it! Bluefeet beat me to it. My attempt is similar, but uses a table to configure the buckets.

CREATE TABLE Bucket
(
    id int,
    minbound int,
    maxbound int
)

INSERT INTO Bucket VALUES(1, 0, 30)
                    ,(2, 31, 60)
                    ,(3, 61, 90)
                    ,(4, 91, 120)
                    ,(5, 121, null)

You can then work out the bucket for each record in a CTE...

;WITH RecordBucket
AS
(
    SELECT
        r.*,
        b.id as bucketid
    FROM
        Record r
        INNER JOIN Bucket b ON r.times BETWEEN b.minbound and ISNULL(b.maxbound, 20000000)
)

...and outer join back to the buckets for the final query, so that the rows can be ordered and empty buckets are included:

-- This SELECT must directly follow the RecordBucket CTE above, as part of the same statement.
select 
    b.id as BucketId, 
    CASE
        WHEN b.maxbound IS NULL THEN CONVERT(VARCHAR(16), b.minbound) + '+'
        ELSE CONVERT(VARCHAR(16), b.minbound) + ' - ' + CONVERT(VARCHAR(16), b.maxbound)
    END as BucketName,
    [2008],[2009],[2010],[2011],[2012] 
from 
    Bucket b
    LEFT JOIN
    (
        SELECT
            bucketid,
            times,
            year
        from
            RecordBucket
    ) rb 
    pivot (count(times) for year in ([2008],[2009],[2010],[2011],[2012])) 
    as pvt ON b.id = pvt.bucketid
order by
    bucketid