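In my table, each row has some data columns and a Priority column (for example, a timestamp or just an integer). I want to group the data by id and then, within each group, take the latest non-NULL value of each column. For example, I have the following table: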
id  A     B     C     Priority
1   NULL  3     4     1
1   5     6     NULL  2
1   8     NULL  NULL  3
2   634   346   359   1
2   34    NULL  734   2
The desired result is:
id  A   B    C
1   8   6    4
2   34  346  734
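To experiment with the queries in this thread, the sample data above can be loaded with a script along these lines. It is only a sketch: the INT column types and the absence of keys and indexes are assumptions, since the question does not show the real table definition.

```sql
-- Sample table from the question. Column types are assumed (INT);
-- the real table has about 20 columns and may use a timestamp for Priority.
CREATE TABLE T (
    id       INT NOT NULL,
    A        INT NULL,
    B        INT NULL,
    C        INT NULL,
    Priority INT NOT NULL
);

-- The multi-row VALUES form requires SQL Server 2008 or later.
INSERT INTO T (id, A, B, C, Priority)
VALUES (1, NULL,    3,    4, 1),
       (1,    5,    6, NULL, 2),
       (1,    8, NULL, NULL, 3),
       (2,  634,  346,  359, 1),
       (2,   34, NULL,  734, 2);
```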
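In this example the table is small, with only 5 columns, but the real table will be much larger. I really need this script to run fast. I tried to write it myself, but my script only works on SQL Server 2012+, so I deleted it as not applicable.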
Numbers: the table can have 150k rows, 20 columns, and 20-80k unique ids; the average of SELECT COUNT(id) FROM T GROUP BY id is 2..5.
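I now have working code (thanks to @ypercubeᵀᴹ), but it runs very slowly on large tables; in my case the script can take a minute or even longer (with indexes, etc.).
How can I speed it up?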
SELECT
    d.id,
    d1.A,
    d2.B,
    d3.C
FROM
    ( SELECT id
      FROM T
      GROUP BY id
    ) AS d
  OUTER APPLY
    ( SELECT TOP (1) A
      FROM T
      WHERE id = d.id
        AND A IS NOT NULL
      ORDER BY priority DESC
    ) AS d1
  OUTER APPLY
    ( SELECT TOP (1) B
      FROM T
      WHERE id = d.id
        AND B IS NOT NULL
      ORDER BY priority DESC
    ) AS d2
  OUTER APPLY
    ( SELECT TOP (1) C
      FROM T
      WHERE id = d.id
        AND C IS NOT NULL
      ORDER BY priority DESC
    ) AS d3 ;
Answer 0 (score: 4)
This should do the trick: anything raised to the power 0 returns 1, except NULL, which stays NULL.
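A minimal sketch of how this trick might be applied, assuming the data columns and Priority are numeric (it does not work as written if Priority is a timestamp). POWER(A, 0) is 1 for any non-NULL A and NULL otherwise, so POWER(A, 0) * Priority equals the row's priority when A has a value and is NULL when it does not. The CTE name ranked and the aliases ra, rb, rc are made up for illustration; this is not necessarily the query the answer originally contained.

```sql
WITH ranked AS (
    SELECT id, A, B, C,
           -- SQL Server sorts NULLs last under DESC, so row number 1 is the
           -- highest-priority row whose column is not NULL (if such a row exists).
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY POWER(A, 0) * Priority DESC) AS ra,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY POWER(B, 0) * Priority DESC) AS rb,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY POWER(C, 0) * Priority DESC) AS rc
    FROM T
)
SELECT id,
       -- Each CASE picks the value from the single row ranked first for that column.
       MAX(CASE WHEN ra = 1 THEN A END) AS A,
       MAX(CASE WHEN rb = 1 THEN B END) AS B,
       MAX(CASE WHEN rc = 1 THEN C END) AS C
FROM ranked
GROUP BY id;
```

ROW_NUMBER() is available from SQL Server 2005, so this does not need the 2012+ features the question rules out. It reads T once but sorts once per column, so whether it beats the OUTER APPLY version depends on the data and the available indexes.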
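Answer 1 (score: 2)
An alternative that may be faster is a multi-join approach: find, for each column, the highest priority at which it is non-NULL, then join back to the original table. The first part: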
select id,
       max(case when a is not null then priority end) as pa,
       max(case when b is not null then priority end) as pb,
       max(case when c is not null then priority end) as pc
from t
group by id;
Then join this back to the table:
with pabc as (
      select id,
             max(case when a is not null then priority end) as pa,
             max(case when b is not null then priority end) as pb,
             max(case when c is not null then priority end) as pc
      from t
      group by id
     )
select pabc.id, ta.a, tb.b, tc.c
from pabc left join
     t ta
     on pabc.id = ta.id and pabc.pa = ta.priority left join
     t tb
     on pabc.id = tb.id and pabc.pb = tb.priority left join
     t tc
     on pabc.id = tc.id and pabc.pc = tc.priority ;
This can also take advantage of an index on t(id, priority).
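The index mentioned above could be created roughly as follows. The index name is arbitrary, and the INCLUDE list is an assumption: covering the data columns lets the joins be answered from the index alone, at the cost of a wider index (noticeably wider with the 20 columns of the real table).

```sql
-- Hypothetical index name. The key (id, priority) matches the join condition;
-- INCLUDE makes the index covering for a, b, c so the base table is not touched.
CREATE INDEX ix_t_id_priority
    ON t (id, priority)
    INCLUDE (a, b, c);
```

Note that the join on (id, priority) yields exactly one row per id only if priority is unique within each id, which the sample data suggests but the question does not state.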
Answer 2 (score: 0)
The previous code will work with the following syntax:
with pabc as (
    select id,
           max(case when a is not null then priority end) as pa,
           max(case when b is not null then priority end) as pb,
           max(case when c is not null then priority end) as pc
    from t
    group by id
)
select pabc.id, ta.a, tb.b, tc.c
from pabc
left join t ta on pabc.id = ta.id and pabc.pa = ta.priority
left join t tb on pabc.id = tb.id and pabc.pb = tb.priority
left join t tc on pabc.id = tc.id and pabc.pc = tc.priority ;
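Answer 3 (score: -1)
This looks odd. You have a log table of all column changes, but no table holding the current data. Now you are looking for a query that assembles the current values from the log table, which is naturally an expensive task.
The solution is simple: keep an additional table that holds the current data. You can even link the two tables with triggers, so that either every insert into the log table updates the current table, or every change written to the current table writes a log entry.
Then just query the current table: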
select id, a, b, c from currenttable order by id;
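One way the trigger-maintained table might look, as a rough sketch only. It assumes that T is the log table, that rows for a given id are inserted in increasing priority order, and that a single INSERT statement never contains two rows for the same id. The table name currenttable comes from the query above; the trigger name trg_T_current and the INT column types are hypothetical.

```sql
CREATE TABLE currenttable (
    id INT NOT NULL PRIMARY KEY,
    a  INT NULL,
    b  INT NULL,
    c  INT NULL
);
GO  -- batch separator (SSMS / sqlcmd); CREATE TRIGGER must start its own batch

CREATE TRIGGER trg_T_current ON T
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Existing ids: overwrite a column only when the new log row supplies a value.
    UPDATE ct
    SET    a = COALESCE(i.A, ct.a),
           b = COALESCE(i.B, ct.b),
           c = COALESCE(i.C, ct.c)
    FROM   currenttable AS ct
    JOIN   inserted     AS i ON i.id = ct.id;

    -- New ids: copy the first log row as the current state.
    INSERT INTO currenttable (id, a, b, c)
    SELECT i.id, i.A, i.B, i.C
    FROM   inserted AS i
    WHERE  NOT EXISTS (SELECT 1 FROM currenttable AS ct WHERE ct.id = i.id);
END;
```

The trade-off is that every insert into T does a little extra work, in exchange for reads that are a plain scan of one row per id. If rows can arrive out of priority order, the trigger would also need to track the priority behind each stored value.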