我有一张桌子:
+----+-------+------+
| id | times | year |
+----+-------+------+
| 5 | 2 | 2008 |
| 6 | 76 | 2008 |
| 2 | 43 | 2009 |
| 4 | 5 | 2009 |
| 1 | 3 | 2010 |
| 9 | 6 | 2010 |
| 7 | 444 | 2011 |
| 8 | 3 | 2011 |
| 3 | 65 | 2012 |
+----+-------+------+
我想在此表格中创建一个数据透视表,根据times
存储year
:
+--------+------+------+------+------+------+
| | 2008 | 2009 | 2010 | 2011 | 2012 |
+--------+------+------+------+------+------+
| 0 | | | | | |
| 1-30 | 1 | 1 | 2 | 1 | |
| 31-60 | | 1 | | | |
| 61-90 | 1 | | | | 1 |
| 91-120 | | | | | |
| 121+ | | | | 1 | |
+--------+------+------+------+------+------+
我如何开始用sql解决这个挑战?非常感谢你的指导。
答案 0 :(得分:6)
您可以使用sql server PIVOT
函数。如果您知道这些年份的所有值以及存储桶,那么您可以对查询进行硬编码:
select *
from
(
select
case
when times = 0 then '0'
when times >= 1 and times <=30 then '1-30'
when times >= 31 and times <=60 then '31-60'
when times >= 61 and times <=90 then '61-90'
when times >= 91 and times <=120 then '91-120'
else '121+' end bucket,
year
from yourtable
) src
pivot
(
count(year)
for year in ([2008], [2009], [2010], [2011], [2012])
) piv;
如果您无法访问PIVOT
功能,则可以使用带CASE
的聚合函数:
select bucket,
sum(case when year = 2008 then 1 else 0 end) [2008],
sum(case when year = 2009 then 1 else 0 end) [2009],
sum(case when year = 2010 then 1 else 0 end) [2010],
sum(case when year = 2011 then 1 else 0 end) [2011],
sum(case when year = 2012 then 1 else 0 end) [2012]
from
(
select
case
when times = 0 then '0'
when times >= 1 and times <=30 then '1-30'
when times >= 31 and times <=60 then '31-60'
when times >= 61 and times <=90 then '61-90'
when times >= 91 and times <=120 then '91-120'
else '121+' end bucket,
year
from yourtable
) src
group by bucket
如果您需要列出所有存储桶,那么您需要将存储桶范围存储在表中或使用CTE
查询,然后您可以使用以下命令:
;with buckets(startbucket, endbucket, rnk) as
(
select 0, 0, 1
union all
select 1, 30, 2
union all
select 31, 60, 3
union all
select 61, 90, 4
union all
select 91, 120, 5
union all
select 121, null, 6
)
select
case when startbucket = 0 then '0'
when endbucket is null then cast(startbucket as varchar(50)) + '+'
else cast(startbucket as varchar(50)) + '-'+cast(endbucket as varchar(50)) end buckets,
[2008], [2009], [2010], [2011], [2012]
from
(
select rnk,
year,
startbucket,
endbucket
from buckets b
left join yourtable t
on t.times >= b.startbucket and t.times <= coalesce(b.endbucket, 100000)
) src
pivot
(
count(year)
for year in ([2008], [2009], [2010], [2011], [2012])
) piv;
结果:
| BUCKETS | 2008 | 2009 | 2010 | 2011 | 2012 |
----------------------------------------------
| 0 | 0 | 0 | 0 | 0 | 0 |
| 1-30 | 1 | 1 | 2 | 1 | 0 |
| 31-60 | 0 | 1 | 0 | 0 | 0 |
| 61-90 | 1 | 0 | 0 | 0 | 1 |
| 91-120 | 0 | 0 | 0 | 0 | 0 |
| 121+ | 0 | 0 | 0 | 1 | 0 |
如果您需要转置已知数量的值(年),则上述步骤将很有效。如果您有一个未知的数字,那么您将需要实现动态sql,类似于:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(year)
from yourtable
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'with buckets(startbucket, endbucket, rnk) as
(
select 0, 0, 1
union all
select 1, 30, 2
union all
select 31, 60, 3
union all
select 61, 90, 4
union all
select 91, 120, 5
union all
select 121, null, 6
)
select
case when startbucket = 0 then ''0''
when endbucket is null then cast(startbucket as varchar(50)) + ''+''
else cast(startbucket as varchar(50)) + ''-''+cast(endbucket as varchar(50)) end buckets,
'+@cols+'
from
(
select rnk,
year,
startbucket, endbucket
from buckets b
left join yourtable t
on t.times >= b.startbucket and t.times <= coalesce(b.endbucket, 100000)
) src
pivot
(
count(year)
for year in ('+@cols+')
) piv;'
execute(@query)
静态(硬编码)版本和动态版本的结果相同。
答案 1 :(得分:3)
Darn it! Bluefeet打败了我。 My attempt类似,但使用表来配置存储桶。
CREATE TABLE Bucket
(
id int,
minbound int,
maxbound int
)
INSERT INTO Bucket VALUES(1, 0, 30)
,(2, 31, 60)
,(3, 61, 90)
,(4, 91, 120)
,(5, 121, null)
然后可以计算CTE中每条记录的桶数......
;WITH RecordBucket
AS
(
SELECT
r.*,
b.id as bucketid
FROM
Record r
INNER JOIN Bucket b ON r.times BETWEEN b.minbound and ISNULL(b.maxbound, 20000000)
)
...和外部联接返回到最终查询的桶,以允许包含订购和空桶:
select
b.id as BucketId,
CASE
WHEN b.maxbound IS NULL THEN CONVERT(VARCHAR(16), b.minbound) + '+'
ELSE CONVERT(VARCHAR(16), b.minbound) + ' - ' + CONVERT(VARCHAR(16), b.maxbound)
END as BucketName,
[2008],[2009],[2010],[2011]
from
Bucket b
LEFT JOIN
(
SELECT
bucketid,
times,
year
from
RecordBucket
) rb
pivot (count(times) for year in ([2008],[2009],[2010],[2011]))
as pvt ON b.id = pvt.bucketid
order by
bucketid