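Transpose rows into columns - cluster by Main_IDs

Date: 2018-02-02 19:37:06

Tags: sql-server tsql transpose horizontallist

I am trying to come up with a way to transpose my data so that it is grouped into one row per cluster. I have already run a query that returns the data vertically, but I would like to know how to transpose it.

This is what the data looks like after my query (I put it into a temp table):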

App  Old_Status_ID  New_Status_ID  Status_Change_Date  UserID
 A         1             2           2015_01_01         22
 A         2             3           2015_02_01         20
 A         3             4           2015_03_20         51
 B         1             2           2015_01_25         84
 B         2             3           2015_02_11         22
 C         1             2           2015_01_02         35
 C         2             3           2015_03_10         01
 C         3             4           2015_04_05         55
 ....

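The table above contains hundreds of different apps, 7 different statuses, and hundreds of users. What I want is to show all the changes for an app in a single row. In addition, I would like to include the elapsed time between status changes (ΔStatus_Change_Date) = ΔSCD.

Here is an example of the desired output:

App  Status_1A  Status_1B  User1  ΔSCD_1  Status_2A  Status_2B  User2  ΔSCD_2  ...
 A       1          2        22      0        2          3        20     31    ...
 B       1          2        84      0        2          3        22     17    ...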

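Unfortunately not everything fits on this line, but I hope the example conveys the concept and the goal.

How can I transpose the data, or write a query, so that all the related data for one app ends up in a single row?

I would really appreciate your help!

Here is some sample data:

+-------+-------------+-------------+------------------+--------+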
|  app  | OldStatusId | NewStatusId | StatusChangeDate | userid |
+-------+-------------+-------------+------------------+--------+
| 16195 |           1 |          32 | 2017-10-03       |   2137 |
| 16195 |          32 |          32 | 2017-10-03       |   2137 |
| 16195 |          32 |           8 | 2018-01-10       |   6539 |
| 16195 |           8 |           2 | 2018-01-12       |   3452 |
| 16505 |           1 |           1 | 2017-04-26       |   3551 |
| 16505 |           1 |          32 | 2017-05-24       |   2063 |
| 16505 |          32 |          32 | 2017-05-24       |   2063 |
| 16505 |           1 |           1 | 2017-06-23       |   3551 |
| 16505 |          32 |           4 | 2017-06-23       |   5291 |
| 16505 |           4 |          32 | 2017-06-26       |   2063 |
| 16505 |          32 |           8 | 2017-06-26       |   5291 |
| 16505 |           2 |           2 | 2017-06-28       |   3438 |
| 16505 |           8 |           2 | 2017-06-28       |   3438 |
| 16505 |           1 |          32 | 2017-08-28       |   2063 |
| 16505 |          32 |           4 | 2017-10-03       |   5291 |
| 16505 |           4 |          32 | 2017-10-04       |   2063 |
| 16505 |           2 |           2 | 2017-10-25       |   3438 |
| 16505 |           8 |           2 | 2017-10-25       |   3438 |
| 16505 |          32 |           8 | 2017-10-25       |   5291 |
| 16515 |           1 |          32 | 2017-06-01       |   2456 |
| 16515 |          32 |          32 | 2017-06-01       |   2456 |
| 16515 |           4 |           4 | 2017-07-25       |   5291 |
| 16515 |          32 |           4 | 2017-07-25       |   5291 |
| 16515 |           4 |          32 | 2017-07-27       |   2456 |
| 16515 |          32 |           4 | 2017-08-09       |   5291 |
| 16515 |           4 |          32 | 2017-08-10       |   2456 |
| 16515 |          32 |           8 | 2017-08-24       |   5291 |
| 16515 |           2 |           2 | 2017-08-28       |   3438 |
| 16515 |           8 |           2 | 2017-08-28       |   3438 |
| 16515 |           1 |          32 | 2017-10-06       |   2456 |
| 16515 |          32 |          32 | 2017-10-06       |   2456 |
| 16515 |           1 |           1 | 2017-10-17       |   2456 |
| 16515 |          32 |         128 | 2017-11-20       |   5291 |
| 16515 |          32 |           8 | 2017-11-29       |   5291 |
| 16515 |         128 |          32 | 2017-11-29       |   5291 |
| 16515 |           8 |           2 | 2017-12-07       |   3611 |
+-------+-------------+-------------+------------------+--------+

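3 answers:

Answer 0: (score: 0)

Using PIVOT

You can use the PIVOT and UNPIVOT relational operators to change a table-valued expression into another table. PIVOT rotates a table-valued expression by turning the unique values from one column of the expression into multiple columns in the output, and it performs aggregations where they are required on any remaining column values that are wanted in the final output.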
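
As a minimal sketch of the operator described above (not a full solution to the question), the following counts status changes per app, with one output column per target status. The table variable @StatusChanges is a hypothetical stand-in that holds a few rows from the question's sample data. Note that PIVOT needs its output columns listed explicitly in the IN clause, so the variable-width layout the question asks for would still require dynamic SQL.

```sql
-- Hypothetical table variable with a few rows taken from the question's sample data.
DECLARE @StatusChanges TABLE (app int, NewStatusId int, userid int);

INSERT INTO @StatusChanges (app, NewStatusId, userid) VALUES
      (16195, 32, 2137)
    , (16195, 32, 2137)
    , (16195,  8, 6539)
    , (16195,  2, 3452)
    , (16505,  1, 3551)
    , (16505, 32, 2063);

-- Each distinct NewStatusId named in the IN list becomes its own column.
-- PIVOT groups implicitly by the remaining column (app) and counts userid per cell.
SELECT p.app
    , p.[1]  AS ChangesToStatus1
    , p.[2]  AS ChangesToStatus2
    , p.[8]  AS ChangesToStatus8
    , p.[32] AS ChangesToStatus32
FROM (SELECT app, NewStatusId, userid FROM @StatusChanges) AS src
PIVOT (COUNT(userid) FOR NewStatusId IN ([1], [2], [8], [32])) AS p
ORDER BY p.app;
```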

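Answer 1: (score: 0)

I will leave the ordering problem to you. As I said earlier, if you have two rows with the same date there is no way to know which one will be listed first, because nothing in the data lets you order them. What you need is some pretty ugly dynamic SQL to generate all of those columns. In this code I am going to use a tally table. On my system I keep it as a view. Here is the code for my tally table.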

create View [dbo].[cteTally] as

WITH
    E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)), -- 10 rows
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 10^2 = 100 rows
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 10^4 = 10,000 rows max
    cteTally(N) AS 
    (
        SELECT  ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
    )
select N from cteTally
GO
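
For readers who do not want to create a view, here is a small self-contained sketch of the same idea as an inline CTE. It is trimmed to 100 rows for brevity, and the name Tally is only illustrative.

```sql
-- Inline tally: cross join a 10-row set with itself to get 100 rows, then number them.
WITH
    E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) dt(n)),
    E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b),
    Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E2)
SELECT N
FROM Tally
WHERE N <= 5   -- returns the numbers 1 through 5
ORDER BY N;
```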

Now we need to put dynamic SQL and this tally table to work. We are going to build SQL that writes the SQL for us. Something like this.

if OBJECT_ID('tempdb..#Something') is not null
    drop table #Something

create table #Something
(
    app int
    , OldStatusId int
    , NewStatusId int
    , StatusChangeDate date 
    , userid int
)

insert #Something 
(
    app
    , OldStatusId
    , NewStatusId
    , StatusChangeDate
    , userid
) VALUES

(16195, 1, 32, '2017-10-03', 2137)
, (16195, 32, 32, '2017-10-03', 2137)
, (16195, 32, 8, '2018-01-10', 6539)
, (16195, 8, 2, '2018-01-12', 3452)
, (16505, 1, 1, '2017-04-26', 3551)
, (16505, 1, 32, '2017-05-24', 2063)
, (16505, 32, 32, '2017-05-24', 2063)
, (16505, 1, 1, '2017-06-23', 3551)
, (16505, 32, 4, '2017-06-23', 5291)
, (16505, 4, 32, '2017-06-26', 2063)
, (16505, 32, 8, '2017-06-26', 5291)
, (16505, 2, 2, '2017-06-28', 3438)
, (16505, 8, 2, '2017-06-28', 3438)
, (16505, 1, 32, '2017-08-28', 2063)
, (16505, 32, 4, '2017-10-03', 5291)
, (16505, 4, 32, '2017-10-04', 2063)
, (16505, 2, 2, '2017-10-25', 3438)
, (16505, 8, 2, '2017-10-25', 3438)
, (16505, 32, 8, '2017-10-25', 5291)
, (16515, 1, 32, '2017-06-01', 2456)
, (16515, 32, 32, '2017-06-01', 2456)
, (16515, 4, 4, '2017-07-25', 5291)
, (16515, 32, 4, '2017-07-25', 5291)
, (16515, 4, 32, '2017-07-27', 2456)
, (16515, 32, 4, '2017-08-09', 5291)
, (16515, 4, 32, '2017-08-10', 2456)
, (16515, 32, 8, '2017-08-24', 5291)
, (16515, 2, 2, '2017-08-28', 3438)
, (16515, 8, 2, '2017-08-28', 3438)
, (16515, 1, 32, '2017-10-06', 2456)
, (16515, 32, 32, '2017-10-06', 2456)
, (16515, 1, 1, '2017-10-17', 2456)
, (16515, 32, 128, '2017-11-20', 5291)
, (16515, 32, 8, '2017-11-29', 5291)
, (16515, 128, 32, '2017-11-29', 5291)
, (16515, 8, 2, '2017-12-07', 3611)

declare @StaticPortion nvarchar(2000) = 
    'with OrderedResults as
    (
        select *, ROW_NUMBER() over(partition by app order by StatusChangeDate) as RowNum
        from #Something
    )
    select app';

declare @DynamicPortion nvarchar(max) = '';

select @DynamicPortion = @DynamicPortion + 
    ', MAX(Case when RowNum = ' + CAST(N as varchar(6)) + ' then OldStatusId end) as OldStatus' + CAST(N as varchar(6)) + CHAR(10)
    + ', MAX(Case when RowNum = ' + CAST(N as varchar(6)) + ' then NewStatusId end) as NewStatus' + CAST(N as varchar(6)) + CHAR(10)
    + ', MAX(Case when RowNum = ' + CAST(N as varchar(6)) + ' then StatusChangeDate end) as StatusChangeDate' + CAST(N as varchar(6)) + CHAR(10)
    + ', MAX(Case when RowNum = ' + CAST(N as varchar(6)) + ' then userid end) as userid' + CAST(N as varchar(6)) + CHAR(10)
from cteTally t
where t.N <= 
(
    select top 1 Count(*)
    from #Something
    group by app
    order by COUNT(*) desc
)

declare @FinalStaticPortion nvarchar(2000) = ' from OrderedResults Group by app order by app';
declare @SqlToExecute nvarchar(max) = @StaticPortion + @DynamicPortion + @FinalStaticPortion;
exec sp_executesql @SqlToExecute
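
To make the dynamic SQL above easier to follow, here is a hand-written sketch of roughly what the generated statement looks like when the busiest app has only two rows. The table variable @Changes is a hypothetical stand-in for #Something. The extra ORDER BY columns are an assumption added only to make the row numbering repeatable when dates tie, which is the ordering problem this answer leaves open.

```sql
-- Hypothetical stand-in for #Something, holding two rows for each of two apps.
DECLARE @Changes TABLE
(
    app int
    , OldStatusId int
    , NewStatusId int
    , StatusChangeDate date
    , userid int
);

INSERT INTO @Changes (app, OldStatusId, NewStatusId, StatusChangeDate, userid) VALUES
      (16195, 32, 8, '2018-01-10', 6539)
    , (16195,  8, 2, '2018-01-12', 3452)
    , (16505,  1, 1, '2017-04-26', 3551)
    , (16505,  1, 32, '2017-05-24', 2063);

WITH OrderedResults AS
(
    -- Number each app's rows; the status columns are an assumed tie-breaker.
    SELECT *, ROW_NUMBER() OVER (PARTITION BY app
                                 ORDER BY StatusChangeDate, OldStatusId, NewStatusId) AS RowNum
    FROM @Changes
)
SELECT app
    -- One group of four columns per row number; the dynamic code emits one group per N.
    , MAX(CASE WHEN RowNum = 1 THEN OldStatusId END) AS OldStatus1
    , MAX(CASE WHEN RowNum = 1 THEN NewStatusId END) AS NewStatus1
    , MAX(CASE WHEN RowNum = 1 THEN StatusChangeDate END) AS StatusChangeDate1
    , MAX(CASE WHEN RowNum = 1 THEN userid END) AS userid1
    , MAX(CASE WHEN RowNum = 2 THEN OldStatusId END) AS OldStatus2
    , MAX(CASE WHEN RowNum = 2 THEN NewStatusId END) AS NewStatus2
    , MAX(CASE WHEN RowNum = 2 THEN StatusChangeDate END) AS StatusChangeDate2
    , MAX(CASE WHEN RowNum = 2 THEN userid END) AS userid2
FROM OrderedResults
GROUP BY app
ORDER BY app;
```

Adding PRINT @SqlToExecute before the exec call is a simple way to inspect the statement that the dynamic version actually builds.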

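The only part I have not shown here is the change from one row to the next. You could do that with a temp table. You would have to use a global temp table, though, because the columns are generated dynamically and the scope of a regular temp table does not let us see it after the dynamic query has executed. Once you understand what this code is doing you should be able to add that last piece yourself. If you get stuck, come back and we will see what we can do.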
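
One possible way to get the row-to-row difference without a temp table is to compute it before the columns are spread out, using the LAG window function (SQL Server 2012 and later). This is a sketch under assumptions, not the approach the answer describes: @Changes is a hypothetical stand-in for #Something, and DaysSincePrevious is an illustrative column name. The result could then be pivoted with one more MAX(CASE ...) expression per row number.

```sql
-- Hypothetical stand-in for #Something with a few rows from the question.
DECLARE @Changes TABLE
(
    app int
    , OldStatusId int
    , NewStatusId int
    , StatusChangeDate date
    , userid int
);

INSERT INTO @Changes (app, OldStatusId, NewStatusId, StatusChangeDate, userid) VALUES
      (16195,  1, 32, '2017-10-03', 2137)
    , (16195, 32,  8, '2018-01-10', 6539)
    , (16195,  8,  2, '2018-01-12', 3452)
    , (16505,  1,  1, '2017-04-26', 3551)
    , (16505,  1, 32, '2017-05-24', 2063);

SELECT app
    , OldStatusId
    , NewStatusId
    , StatusChangeDate
    , userid
    -- LAG returns NULL for the first row of each app, so ISNULL turns that into 0.
    -- For app 16195 this yields 0, 99 and 2 days.
    , ISNULL(DATEDIFF(day,
                      LAG(StatusChangeDate) OVER (PARTITION BY app ORDER BY StatusChangeDate),
                      StatusChangeDate), 0) AS DaysSincePrevious
FROM @Changes
ORDER BY app, StatusChangeDate;
```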

Answer 2: (score: 0)

You can reach your goal simply by using the STRING_AGG function (available in SQL Server 2017 and later).

Sample data:

create table clean (
  app VARCHAR(50),
  old_status INT,
  new_status INT,
  start_date DATETIME,
  end_date DATETIME,
  user_id INT
);

INSERT INTO CLEAN VALUES('14595', 2, 2, '9/12/2017 16:14:33', '11/1/2017 15:37:58', 3470);
INSERT INTO CLEAN VALUES('14595', 1, 2, '9/12/2017 16:14:33', '11/1/2017 15:37:58', 3470);
INSERT INTO CLEAN VALUES('14595', 2, 64, '11/1/2017 15:21:49', '11/1/2017 15:37:58', 3470);
INSERT INTO CLEAN VALUES('14595', 2, 2, '11/1/2017 15:37:58', NULL, 3470);
INSERT INTO CLEAN VALUES('14595', 64, 2, '11/1/2017 15:37:58', NULL, 3470);
INSERT INTO CLEAN VALUES('14595', 32, 8, '9/27/2017 10:19:48', '1/26/2018 10:50:18', 5291);
INSERT INTO CLEAN VALUES('14595', 32, 8, '1/26/2018 10:50:18', NULL, 5291);
INSERT INTO CLEAN VALUES('14595', 1, 32, '9/13/2017 15:18:24', NULL, 5297);
INSERT INTO CLEAN VALUES('14595', 1, 1, '7/14/2017 14:29:51', '1/19/2018 14:15:13', 5327);
INSERT INTO CLEAN VALUES('14595', 1, 32, '1/19/2018 14:15:13', NULL, 5327);
INSERT INTO CLEAN VALUES('14595', 2, 2, '9/27/2017 10:40:06', '1/26/2018 10:52:54', 6509);
INSERT INTO CLEAN VALUES('14595', 8, 2, '9/27/2017 10:40:06', '1/26/2018 10:52:54', 6509);
INSERT INTO CLEAN VALUES('14595', 8, 2, '1/26/2018 10:52:54', NULL, 6509);
INSERT INTO CLEAN VALUES('14596', 32, 4, '10/9/2017 14:28:10', '12/14/2017 14:45:59', 5290);
INSERT INTO CLEAN VALUES('14596', 32, 4, '10/11/2017 11:57:05', '12/14/2017 14:45:59', 5290);
INSERT INTO CLEAN VALUES('14596', 8, 8, '10/11/2017 15:02:23', '12/14/2017 14:45:59', 5290);
INSERT INTO CLEAN VALUES('14596', 32, 8, '10/11/2017 15:02:23', '12/14/2017 14:45:59', 5290);
INSERT INTO CLEAN VALUES('14596', 32, 4, '12/13/2017 10:51:30', '12/14/2017 14:45:59', 5290);
INSERT INTO CLEAN VALUES('14596', 32, 8, '12/14/2017 14:45:59', NULL, 5290);
INSERT INTO CLEAN VALUES('14596', 1, 1, '8/11/2017 12:17:49', '1/12/2018 16:06:16', 5298);
INSERT INTO CLEAN VALUES('14596', 1, 32, '9/19/2017 16:00:36', '1/12/2018 16:06:16', 5298);
INSERT INTO CLEAN VALUES('14596', 4, 32, '10/9/2017 15:45:59', '1/12/2018 16:06:16', 5298);
INSERT INTO CLEAN VALUES('14596', 4, 32, '10/11/2017 12:43:21', '1/12/2018 16:06:16', 5298);
INSERT INTO CLEAN VALUES('14596', 1, 32, '11/9/2017 16:05:44', '1/12/2018 16:06:16', 5298);
INSERT INTO CLEAN VALUES('14596', 32, 32, '11/9/2017 16:05:44', '1/12/2018 16:06:16', 5298);
INSERT INTO CLEAN VALUES('14596', 4, 32, '12/14/2017 10:38:19', '1/12/2018 16:06:16', 5298);
INSERT INTO CLEAN VALUES('14596', 1, 32, '1/12/2018 16:06:16', NULL, 5298);
INSERT INTO CLEAN VALUES('14596', 8, 2, '12/13/2017 13:36:56', '1/4/2018 16:47:43', 6506);
INSERT INTO CLEAN VALUES('14596', 8, 2, '1/4/2018 16:47:43', NULL, 6506);
INSERT INTO CLEAN VALUES('15980', 8, 2, '1/18/2018 16:11:46', '1/19/2018 10:27:44', 3441);
INSERT INTO CLEAN VALUES('15980', 8, 2, '1/19/2018 10:27:44', NULL, 3441);
INSERT INTO CLEAN VALUES('15980', 32, 8, '1/17/2018 11:11:40', '1/18/2018 10:22:32', 5290);
INSERT INTO CLEAN VALUES('15980', 32, 128, '1/17/2018 15:54:36', '1/18/2018 10:22:32', 5290);
INSERT INTO CLEAN VALUES('15980', 128, 32, '1/18/2018 10:22:28', '1/18/2018 10:22:32', 5290);
INSERT INTO CLEAN VALUES('15980', 32, 8, '1/18/2018 10:22:32', NULL, 5290);
INSERT INTO CLEAN VALUES('15980', 1, 1, '10/1/2017 21:54:45', '12/27/2017 0:11:12', 5467);
INSERT INTO CLEAN VALUES('15980', 1, 32, '12/27/2017 0:00:18', '12/27/2017 0:11:12', 5467);
INSERT INTO CLEAN VALUES('15980', 1, 32, '12/27/2017 0:11:12', NULL, 5467);
INSERT INTO CLEAN VALUES('15998', 1, 32, '6/1/2017 13:32:49', '12/12/2017 12:52:16', 2456);
INSERT INTO CLEAN VALUES('15998', 32, 32, '6/1/2017 13:32:49', '12/12/2017 12:52:16', 2456);
INSERT INTO CLEAN VALUES('15998', 4, 32, '7/24/2017 9:51:27', '12/12/2017 12:52:16', 2456);
INSERT INTO CLEAN VALUES('15998', 4, 32, '7/27/2017 13:26:39', '12/12/2017 12:52:16', 2456);
INSERT INTO CLEAN VALUES('15998', 4, 32, '8/10/2017 13:19:22', '12/12/2017 12:52:16', 2456);
INSERT INTO CLEAN VALUES('15998', 1, 32, '10/6/2017 13:43:21', '12/12/2017 12:52:16', 2456);
INSERT INTO CLEAN VALUES('15998', 32, 32, '10/6/2017 13:43:21', '12/12/2017 12:52:16', 2456);
INSERT INTO CLEAN VALUES('15998', 1, 1, '10/17/2017 12:51:12', '12/12/2017 12:52:16', 2456);
INSERT INTO CLEAN VALUES('15998', 1, 32, '12/12/2017 12:52:16', NULL, 2456);
INSERT INTO CLEAN VALUES('15998', 8, 2, '8/18/2017 13:26:22', NULL, 3438);
INSERT INTO CLEAN VALUES('15998', 2, 2, '8/18/2017 13:26:22', NULL, 3438);
INSERT INTO CLEAN VALUES('15998', 2, 2, '11/10/2017 13:15:40', NULL, 3611);
INSERT INTO CLEAN VALUES('15998', 8, 2, '11/10/2017 13:15:40', NULL, 3611);
INSERT INTO CLEAN VALUES('15998', 4, 4, '7/21/2017 11:19:39', '11/10/2017 12:20:50', 5291);
INSERT INTO CLEAN VALUES('15998', 32, 4, '7/21/2017 11:19:39', '11/10/2017 12:20:50', 5291);
INSERT INTO CLEAN VALUES('15998', 32, 4, '7/25/2017 13:15:59', '11/10/2017 12:20:50', 5291);
INSERT INTO CLEAN VALUES('15998', 4, 4, '7/25/2017 13:15:59', '11/10/2017 12:20:50', 5291);
INSERT INTO CLEAN VALUES('15998', 32, 4, '8/9/2017 16:36:43', '11/10/2017 12:20:50', 5291);
INSERT INTO CLEAN VALUES('15998', 32, 8, '8/10/2017 13:46:16', '11/10/2017 12:20:50', 5291);
INSERT INTO CLEAN VALUES('15998', 32, 128, '11/7/2017 16:42:24', '11/10/2017 12:20:50', 5291);
INSERT INTO CLEAN VALUES('15998', 128, 32, '11/10/2017 12:20:43', '11/10/2017 12:20:50', 5291);
INSERT INTO CLEAN VALUES('15998', 32, 8, '11/10/2017 12:20:50', NULL, 5291);

Query:

with a as (
    select app, user_id,
    old_status, new_status, 
    start_date, end_date, datediff(day, start_date, end_date) as delta 
    from clean
), b as (
    select app, user_id,
    old_status, new_status, 
    start_date, end_date, ISNULL(delta, 0) as delta
    from a
    where old_status != new_status
), c as (
    select app, user_id,
    concat('[', old_status, '-', new_status, ' ', delta, ' days]') as column_2
    from b
), d as (
    select c.app, concat('{USER: ', c.user_id, ' ', STRING_AGG(c.column_2, ' | '), '}') as concat
    from c
    group by c.app, c.user_id
)
select d.app, STRING_AGG(d.concat, '; ') as user_activity from d
group by d.app
order by d.app;

Result:

+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  app  | user_activity                                                                                                                                                                                                                                                                                                    |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 14595 | {USER: 3470 [1-2 50 days] | [2-64 0 days] | [64-2 0 days]}; {USER: 5291 [32-8 121 days] | [32-8 0 days]}; {USER: 5297 [1-32 0 days]}; {USER: 5327 [1-32 0 days]}; {USER: 6509 [8-2 121 days] | [8-2 0 days]}
| 14596 | {USER: 5290 [32-4 66 days] | [32-4 64 days] | [32-8 64 days] | [32-4 1 days] | [32-8 0 days]}; {USER: 5298 [1-32 115 days] | [4-32 95 days] | [4-32 93 days] | [1-32 64 days] | [4-32 29 days] | [1-32 0 days]}; {USER: 6506 [8-2 22 days] | [8-2 0 days]}
| 15980 | {USER: 3441 [8-2 1 days] | [8-2 0 days]}; {USER: 5290 [32-8 1 days] | [32-128 1 days] | [128-32 0 days] | [32-8 0 days]}; {USER: 5467 [1-32 0 days] | [1-32 0 days]}
| 15998 | {USER: 2456 [1-32 194 days] | [4-32 141 days] | [4-32 138 days] | [4-32 124 days] | [1-32 67 days] | [1-32 0 days]}; {USER: 3438 [8-2 0 days]}; {USER: 3611 [8-2 0 days]}; {USER: 5291 [32-4 112 days] | [32-4 108 days] | [32-4 93 days] | [32-8 92 days] | [32-128 3 days] | [128-32 0 days] | [32-8 0 days]}

If the order of the changes and the order of the users matter, a second solution uses the WITHIN GROUP clause:

with a as (
    select app, user_id,
    old_status, new_status, 
    start_date, end_date, datediff(day, start_date, end_date) as delta 
    from clean
), b as (
    select app, user_id,
    old_status, new_status, 
    start_date, end_date, ISNULL(delta, 0) as delta
    from a
    where old_status != new_status
), c as (
    select app, user_id, start_date,
    concat('[', old_status, '-', new_status, ' ', delta, ' days]') as column_2
    from b
), d as (
    select c.app, concat('{USER: ', c.user_id, ' ', STRING_AGG(c.column_2, ' | ') WITHIN GROUP (ORDER BY c.start_date ASC), '}') as concat
    from c
    group by c.app, c.user_id
)
select d.app, STRING_AGG(d.concat, '; ') WITHIN GROUP (ORDER BY d.concat ASC) as user_activity from d
group by d.app
order by d.app;

Result:

+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  app  | user_activity                                                                                                                                                                                                                                                                                                    |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 14595 | {USER: 3470 [1-2 50 days] | [2-64 0 days] | [64-2 0 days]}; {USER: 5291 [32-8 121 days] | [32-8 0 days]}; {USER: 5297 [1-32 0 days]}; {USER: 5327 [1-32 0 days]}; {USER: 6509 [8-2 121 days] | [8-2 0 days]}
| 14596 | {USER: 5290 [32-4 66 days] | [32-4 64 days] | [32-8 64 days] | [32-4 1 days] | [32-8 0 days]}; {USER: 5298 [1-32 115 days] | [4-32 95 days] | [4-32 93 days] | [1-32 64 days] | [4-32 29 days] | [1-32 0 days]}; {USER: 6506 [8-2 22 days] | [8-2 0 days]}
| 15980 | {USER: 3441 [8-2 1 days] | [8-2 0 days]}; {USER: 5290 [32-8 1 days] | [32-128 1 days] | [128-32 0 days] | [32-8 0 days]}; {USER: 5467 [1-32 0 days] | [1-32 0 days]}
| 15998 | {USER: 2456 [1-32 194 days] | [4-32 141 days] | [4-32 138 days] | [4-32 124 days] | [1-32 67 days] | [1-32 0 days]}; {USER: 3438 [8-2 0 days]}; {USER: 3611 [8-2 0 days]}; {USER: 5291 [32-4 112 days] | [32-4 108 days] | [32-4 93 days] | [32-8 92 days] | [32-128 3 days] | [128-32 0 days] | [32-8 0 days]}
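
One caveat about the outer WITHIN GROUP (ORDER BY d.concat ASC): it sorts the assembled strings as text, so a user ID with fewer digits can sort after one with more digits. The sketch below shows one way to order by the numeric ID instead. The table variable @PerUser stands in for CTE d with user_id carried through, and the user IDs 980 and 10021 are made up for illustration.

```sql
-- Hypothetical stand-in for CTE d, with user_id kept as its own column.
DECLARE @PerUser TABLE (app varchar(50), user_id int, activity varchar(200));

INSERT INTO @PerUser (app, user_id, activity) VALUES
      ('14595', 10021, '{USER: 10021 [8-2 0 days]}')
    , ('14595',   980, '{USER: 980 [1-2 3 days]}')
    , ('14595',  3470, '{USER: 3470 [1-2 50 days]}');

-- Ordering by the integer column lists users 980, 3470, 10021.
-- Ordering by the activity string would list 10021, 3470, 980 instead.
SELECT app
    , STRING_AGG(activity, '; ') WITHIN GROUP (ORDER BY user_id ASC) AS user_activity
FROM @PerUser
GROUP BY app
ORDER BY app;
```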