Pivot the top ten values of a table in SQL Server

Time: 2018-11-19 08:38:55

Tags: sql sql-server tsql

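Consider two tables: one holds the details of work to be done (cases), and the other describes the work carried out on each case (activities).

The cases table has roughly 20 million rows.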

CREATE TABLE #cases
    (CASEID int, DETAILS varchar(1))

INSERT INTO #cases
    (CASEID, DETAILS)
VALUES
    (1, 'A'),
    (2, 'B'),
    (3, 'C')
;

The activities table has roughly 180 million rows.

CREATE TABLE #activities
    (ACTIVITYID int, CASEID int, CODE varchar(3), STARTDATE date)

INSERT INTO #activities
    (ACTIVITYID, CASEID, CODE, STARTDATE)
VALUES
    (1, 1, '00', '2018-01-01'),
    (2, 1, '110', '2018-02-01'),
    (3, 1, '900', '2018-03-01'),
    (4, 1, '910', '2018-05-01'),
    (5, 1, '920', '2018-04-01'),
    (6, 2, '900', '2018-01-01'),
    (7, 2, '110', '2018-02-01'),
    (8, 2, '900', '2018-03-01'),
    (9, 3, '00', '2018-01-01'),
    (10, 3, '123', '2018-02-01')
;

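This isn't ideal, but I need to find a way to create a wide table that contains the case details followed by the details of the first 10 activities whose code is in the range 900-999.

Some cases will have more than 10 activities in that range, and some will have none.

The output I'm looking for is something like: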

CASEID  DETAILS  CODE1st900  STARTDATE1st900      CODE2nd900  STARTDATE2nd900      CODE3rd900  STARTDATE3rd900
1       A        900         01/01/2018 00:00:00  920         01/04/2018 00:00:00  910         01/05/2018 00:00:00
2       B        900         01/01/2018 00:00:00  900         01/03/2018 00:00:00
3       C

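Ultimately, I'm not sure which approach is best: some kind of clever pivot, joining each set of values with a subquery, or a cursor, which is how my organisation has usually produced this sort of data in the past.

A DBFiddle to play with: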

https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=5eef2de402726218a8472880ef0bab85

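2 Answers:

Answer 0: (score: 3)

Normally we'd want to use PIVOT, but there is currently no syntax for pivoting more than one column at a time. So we'll use conditional aggregation instead: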

declare @cases table (CASEID int, DETAILS varchar(1))
INSERT INTO @cases (CASEID, DETAILS) VALUES
(1, 'A'),
(2, 'B'),
(3, 'C');

declare @activities table (ACTIVITYID int, CASEID int, CODE varchar(3), STARTDATE date)
INSERT INTO @activities (ACTIVITYID, CASEID, CODE, STARTDATE) VALUES
(1, 1, '00', '2018-01-01'),
(2, 1, '110', '2018-02-01'),
(3, 1, '900', '2018-03-01'),
(4, 1, '910', '2018-05-01'),
(5, 1, '920', '2018-04-01'),
(6, 2, '900', '2018-01-01'),
(7, 2, '110', '2018-02-01'),
(8, 2, '900', '2018-03-01'),
(9, 3, '00', '2018-01-01'),
(10, 3, '123', '2018-02-01');

select
    c.CASEID,
    c.DETAILS,
    -- one CODE/STARTDATE pair per rank; MAX just picks the single non-NULL value
    MAX(CASE WHEN rn=1 THEN CODE END) as Code1st,
    MAX(CASE WHEN rn=1 THEN STARTDATE END) as Start1st,
    MAX(CASE WHEN rn=2 THEN CODE END) as Code2nd,
    MAX(CASE WHEN rn=2 THEN STARTDATE END) as Start2nd
from
    @cases c
        left join   -- keeps cases that have no 900-999 activities
    (select *,ROW_NUMBER() OVER (PARTITION BY CASEID ORDER BY STARTDATE) rn
     from @activities
     where CODE BETWEEN 900 and 999) a   -- CODE is varchar, so this relies on implicit conversion to int
        on
            c.CASEID = a.CASEID and
            a.rn <= 10
group by c.CASEID,c.DETAILS

I've shown the first two pairs being pivoted above. Hopefully you can see how it extends to the remaining 8.
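For illustration, here is the same pattern carried one step further, to the third pair, which is enough to match the three-pair sample output in the question. This is only a sketch against the question's sample data. It assumes every CODE value is a numeric string (the BETWEEN filter relies on an implicit varchar-to-int conversion), and the ACTIVITYID tie-breaker in the ORDER BY is my own addition, not part of the answer above.

```sql
-- Sample data copied from the question
declare @cases table (CASEID int, DETAILS varchar(1));
insert into @cases (CASEID, DETAILS) values (1, 'A'), (2, 'B'), (3, 'C');

declare @activities table (ACTIVITYID int, CASEID int, CODE varchar(3), STARTDATE date);
insert into @activities (ACTIVITYID, CASEID, CODE, STARTDATE) values
(1, 1, '00',  '2018-01-01'), (2, 1, '110', '2018-02-01'),
(3, 1, '900', '2018-03-01'), (4, 1, '910', '2018-05-01'),
(5, 1, '920', '2018-04-01'), (6, 2, '900', '2018-01-01'),
(7, 2, '110', '2018-02-01'), (8, 2, '900', '2018-03-01'),
(9, 3, '00',  '2018-01-01'), (10, 3, '123', '2018-02-01');

select
    c.CASEID,
    c.DETAILS,
    MAX(CASE WHEN a.rn = 1 THEN a.CODE END)      as Code1st,
    MAX(CASE WHEN a.rn = 1 THEN a.STARTDATE END) as Start1st,
    MAX(CASE WHEN a.rn = 2 THEN a.CODE END)      as Code2nd,
    MAX(CASE WHEN a.rn = 2 THEN a.STARTDATE END) as Start2nd,
    MAX(CASE WHEN a.rn = 3 THEN a.CODE END)      as Code3rd,
    MAX(CASE WHEN a.rn = 3 THEN a.STARTDATE END) as Start3rd
    -- rn = 4 through rn = 10 repeat the same two lines
from @cases c
left join
    (select CASEID, CODE, STARTDATE,
            -- ACTIVITYID is an assumed tie-breaker for activities starting on the same date
            ROW_NUMBER() OVER (PARTITION BY CASEID ORDER BY STARTDATE, ACTIVITYID) as rn
     from @activities
     where CODE BETWEEN 900 and 999) a
    on c.CASEID = a.CASEID and a.rn <= 10
group by c.CASEID, c.DETAILS
order by c.CASEID;
```

On the sample data, case 1 comes back as 900 / 2018-03-01, 920 / 2018-04-01, 910 / 2018-05-01; case 2 fills only the first two pairs; and case 3 has NULL in every pivoted column.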

Answer 1: (score: 1)

Given the volume of data, I might use apply:

select c.*, a.*
from cases c outer apply
     (select max(case when seqnum = 1 then code end) as code_1,
             max(case when seqnum = 1 then startdate end) as startdate_1,
             max(case when seqnum = 2 then code end) as code_2,
             max(case when seqnum = 2 then startdate end) as startdate_2,
             . . .
      from (select top (10) a.*,
                   row_number() over (partition by a.caseid order by a.startdate) as seqnum
            from activities a
            where a.caseid = c.caseid and
                  a.code between 900 and 999
            order by a.startdate  -- without this, top (10) may keep any 10 rows
           ) a
      ) a;

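This should perform better than solutions that use pivot or group by, because the data from cases never needs to be aggregated. The aggregation happens as needed, ten rows at a time.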
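A runnable sketch of the apply approach against the question's sample data, limited to the first two pairs. The table variables, the aliases `act` and `t`, the explicit ORDER BY on the TOP (10) subquery, and the ACTIVITYID tie-breaker are assumptions made for this illustration. As before, the CODE filter relies on implicit conversion of numeric strings.

```sql
-- Sample data copied from the question
declare @cases table (CASEID int, DETAILS varchar(1));
insert into @cases (CASEID, DETAILS) values (1, 'A'), (2, 'B'), (3, 'C');

declare @activities table (ACTIVITYID int, CASEID int, CODE varchar(3), STARTDATE date);
insert into @activities (ACTIVITYID, CASEID, CODE, STARTDATE) values
(1, 1, '00',  '2018-01-01'), (2, 1, '110', '2018-02-01'),
(3, 1, '900', '2018-03-01'), (4, 1, '910', '2018-05-01'),
(5, 1, '920', '2018-04-01'), (6, 2, '900', '2018-01-01'),
(7, 2, '110', '2018-02-01'), (8, 2, '900', '2018-03-01'),
(9, 3, '00',  '2018-01-01'), (10, 3, '123', '2018-02-01');

select c.CASEID, c.DETAILS,
       a.code_1, a.startdate_1, a.code_2, a.startdate_2
from @cases c
outer apply
     -- an aggregate without GROUP BY always returns exactly one row,
     -- so cases with no matching activities get a row of NULLs
     (select max(case when t.seqnum = 1 then t.CODE end)      as code_1,
             max(case when t.seqnum = 1 then t.STARTDATE end) as startdate_1,
             max(case when t.seqnum = 2 then t.CODE end)      as code_2,
             max(case when t.seqnum = 2 then t.STARTDATE end) as startdate_2
      from (select top (10) act.CODE, act.STARTDATE,
                   row_number() over (order by act.STARTDATE, act.ACTIVITYID) as seqnum
            from @activities act
            where act.CASEID = c.CASEID and
                  act.CODE between 900 and 999
            -- makes top (10) deterministic: the earliest ten activities
            order by act.STARTDATE, act.ACTIVITYID
           ) t
     ) a
order by c.CASEID;
```

On the real tables, the per-case lookup would presumably benefit from an index on activities that leads with CASEID and STARTDATE, though that is an assumption worth checking against the actual execution plan rather than something the answer states.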