I have a table in Teradata with 6 columns, as shown below:
ID Feature1 Feature2 Feature3 Feature4 Feature5
1 12 15 1 22 350
2 121 0.9 999 756 879
...
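For each row, I need the names of the columns holding the greatest, second-greatest, and third-greatest values, so the output should look like this: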
ID Greatest 2nd_Greatest 3rd_Greatest
1 Feature5 Feature4 Feature2
2 Feature3 Feature5 Feature4
Can anyone help?
Thanks!
Answer 0 (score: 2)
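You can do this with a large number of case expressions, and it gets more complicated if any of the values can be NULL. It would be the fastest approach, though.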
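As a rough illustration of the case-based approach, here is a minimal sketch that finds only the name of the greatest column. It runs the SQL through Python's built-in sqlite3 module instead of Teradata, purely so it can be tried locally. The table name tab and the sample rows are taken from the question, and the sketch assumes no NULLs and no ties.

```python
import sqlite3

# In-memory SQLite database standing in for Teradata (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (id INTEGER, feature1 REAL, feature2 REAL, "
    "feature3 REAL, feature4 REAL, feature5 REAL)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 12, 15, 1, 22, 350), (2, 121, 0.9, 999, 756, 879)],
)

# Each WHEN only has to beat the columns that come after it, because
# reaching that branch already means the earlier columns were not the maximum.
# With NULLs the comparisons evaluate to NULL and rows fall through to ELSE,
# which is why NULL handling makes this approach more complicated.
greatest_sql = """
SELECT id,
       CASE
         WHEN feature1 >= feature2 AND feature1 >= feature3
          AND feature1 >= feature4 AND feature1 >= feature5 THEN 'Feature1'
         WHEN feature2 >= feature3 AND feature2 >= feature4
          AND feature2 >= feature5 THEN 'Feature2'
         WHEN feature3 >= feature4 AND feature3 >= feature5 THEN 'Feature3'
         WHEN feature4 >= feature5 THEN 'Feature4'
         ELSE 'Feature5'
       END AS greatest
FROM tab
ORDER BY id
"""
greatest = conn.execute(greatest_sql).fetchall()
print(greatest)
```

The second and third greatest need many more branches per column, which is where this approach becomes unwieldy.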
The simplest approach is probably to unpivot the data and re-aggregate:
-- tab stands for the source table
select id,
       max(case when seqnum = 1 then feature end) as greatest_feature,
       max(case when seqnum = 2 then feature end) as greatest_feature2,
       max(case when seqnum = 3 then feature end) as greatest_feature3,
       max(case when seqnum = 1 then which end) as which_1,
       max(case when seqnum = 2 then which end) as which_2,
       max(case when seqnum = 3 then which end) as which_3
from (select id, feature, which,
             -- rank the unpivoted values within each id, largest first
             row_number() over (partition by id order by feature desc) as seqnum
      from ((select id, feature1 as feature, 'feature1' as which from tab) union all
            (select id, feature2 as feature, 'feature2' as which from tab) union all
            (select id, feature3 as feature, 'feature3' as which from tab) union all
            (select id, feature4 as feature, 'feature4' as which from tab) union all
            (select id, feature5 as feature, 'feature5' as which from tab)
           ) u
     ) t
group by id;
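To check the unpivot-and-re-aggregate idea against the sample rows from the question, here is a small sketch that runs an equivalent query through Python's built-in sqlite3 module. SQLite is only a stand-in for Teradata here (window functions need SQLite 3.25 or later), and the table name tab and the output column names are assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for Teradata (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (id INTEGER, feature1 REAL, feature2 REAL, "
    "feature3 REAL, feature4 REAL, feature5 REAL)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 12, 15, 1, 22, 350), (2, 121, 0.9, 999, 756, 879)],
)

# Step 1: unpivot the five columns into (id, feature, which) rows.
# Step 2: number the rows per id by descending value.
# Step 3: pivot the top three column names back into one row per id.
top3_sql = """
SELECT id,
       MAX(CASE WHEN seqnum = 1 THEN which END) AS greatest,
       MAX(CASE WHEN seqnum = 2 THEN which END) AS second_greatest,
       MAX(CASE WHEN seqnum = 3 THEN which END) AS third_greatest
FROM (SELECT id, which,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY feature DESC) AS seqnum
      FROM (SELECT id, feature1 AS feature, 'Feature1' AS which FROM tab UNION ALL
            SELECT id, feature2, 'Feature2' FROM tab UNION ALL
            SELECT id, feature3, 'Feature3' FROM tab UNION ALL
            SELECT id, feature4, 'Feature4' FROM tab UNION ALL
            SELECT id, feature5, 'Feature5' FROM tab
           ) AS u
     ) AS r
GROUP BY id
ORDER BY id
"""
top3 = conn.execute(top3_sql).fetchall()
for row in top3:
    print(row)
```

On the two sample rows this reproduces the output requested in the question. Note that ROW_NUMBER breaks ties arbitrarily, so rows with equal values may return either column name.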
Answer 1 (score: 1)
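Refining Gordon's answer:
Instead of making multiple passes over the source table with those UNIONs, you can create a list of feature names and cross join to it: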
SELECT t.id, f.feature,
       CASE f.feature
          WHEN 'feature1' THEN t.feature1
          WHEN 'feature2' THEN t.feature2
          WHEN 'feature3' THEN t.feature3
          WHEN 'feature4' THEN t.feature4
          WHEN 'feature5' THEN t.feature5
       END AS val
FROM tab AS t CROSS JOIN
 (
   SELECT * FROM (SELECT 'feature1' AS feature) AS dt
   UNION ALL
   SELECT * FROM (SELECT 'feature2' AS feature) AS dt
   UNION ALL
   SELECT * FROM (SELECT 'feature3' AS feature) AS dt
   UNION ALL
   SELECT * FROM (SELECT 'feature4' AS feature) AS dt
   UNION ALL
   SELECT * FROM (SELECT 'feature5' AS feature) AS dt
 ) AS f
You can build the list with a UNION as above, or use a real table.
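To see what the cross join produces, here is a minimal sketch that runs an equivalent query on the question's sample rows using Python's built-in sqlite3 module. SQLite is only a stand-in for Teradata, so the feature list is written as plain SELECTs without the derived-table wrapper; the table name tab follows the query above.

```python
import sqlite3

# In-memory SQLite database standing in for Teradata (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (id INTEGER, feature1 REAL, feature2 REAL, "
    "feature3 REAL, feature4 REAL, feature5 REAL)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 12, 15, 1, 22, 350), (2, 121, 0.9, 999, 756, 879)],
)

# Every source row is paired with every feature name, and the CASE
# expression picks the matching column, so the source table is read once.
unpivot_sql = """
SELECT t.id, f.feature,
       CASE f.feature
         WHEN 'feature1' THEN t.feature1
         WHEN 'feature2' THEN t.feature2
         WHEN 'feature3' THEN t.feature3
         WHEN 'feature4' THEN t.feature4
         WHEN 'feature5' THEN t.feature5
       END AS val
FROM tab AS t
CROSS JOIN (
  SELECT 'feature1' AS feature UNION ALL
  SELECT 'feature2' UNION ALL
  SELECT 'feature3' UNION ALL
  SELECT 'feature4' UNION ALL
  SELECT 'feature5'
) AS f
ORDER BY t.id, f.feature
"""
rows = conn.execute(unpivot_sql).fetchall()
for row in rows:
    print(row)
```

The result has one row per (id, feature) pair, which is the long format that the ranking step then works on.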
Starting with TD14.10 there is also a TD_UNPIVOT table operator (but still no PIVOT):
SELECT *
FROM TD_UNPIVOT
 (
   ON (SELECT id, feature1, feature2, feature3, feature4, feature5 FROM tab)
   USING
     VALUE_COLUMNS('val')
     UNPIVOT_COLUMN('feature')
     COLUMN_LIST('feature1', 'feature2', 'feature3', 'feature4', 'feature5')
 ) AS dt
Also starting with TD14.10 there is LAST_VALUE, which can be combined with ROW_NUMBER to find the n-th greatest value and so avoid the final aggregation:
SELECT id,
   feature AS "Greatest",
   LAST_VALUE(feature)
   OVER (PARTITION BY id ORDER BY val DESC
         ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS "2nd_Greatest",
   LAST_VALUE(feature)
   OVER (PARTITION BY id ORDER BY val DESC
         ROWS BETWEEN 2 FOLLOWING AND 2 FOLLOWING) AS "3rd_Greatest"
FROM TD_UNPIVOT
 (
   ON (SELECT id, feature1, feature2, feature3, feature4, feature5 FROM tab)
   USING
     VALUE_COLUMNS('val')
     UNPIVOT_COLUMN('feature')
     COLUMN_LIST('feature1', 'feature2', 'feature3', 'feature4', 'feature5')
 ) AS dt
QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY val DESC) = 1;
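The frame trick above (a one-row window placed 1 or 2 rows after the current row) can be tried without a Teradata system. The sketch below is an approximation in SQLite via Python's built-in sqlite3 module: TD_UNPIVOT is replaced by a UNION ALL and QUALIFY by an outer WHERE, since SQLite has neither. Those substitutions, the table name tab, and the output column names are assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for Teradata (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (id INTEGER, feature1 REAL, feature2 REAL, "
    "feature3 REAL, feature4 REAL, feature5 REAL)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 12, 15, 1, 22, 350), (2, 121, 0.9, 999, 756, 879)],
)

# On the row holding the greatest value, a frame of exactly one row that
# starts n rows later contains the (n+1)-th greatest value, so LAST_VALUE
# over that frame returns its column name. Keeping only rn = 1 then gives
# one row per id without a GROUP BY.
nth_sql = """
SELECT id, greatest, second_greatest, third_greatest
FROM (SELECT id,
             feature AS greatest,
             LAST_VALUE(feature) OVER (PARTITION BY id ORDER BY val DESC
                 ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS second_greatest,
             LAST_VALUE(feature) OVER (PARTITION BY id ORDER BY val DESC
                 ROWS BETWEEN 2 FOLLOWING AND 2 FOLLOWING) AS third_greatest,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY val DESC) AS rn
      FROM (SELECT id, feature1 AS val, 'Feature1' AS feature FROM tab UNION ALL
            SELECT id, feature2, 'Feature2' FROM tab UNION ALL
            SELECT id, feature3, 'Feature3' FROM tab UNION ALL
            SELECT id, feature4, 'Feature4' FROM tab UNION ALL
            SELECT id, feature5, 'Feature5' FROM tab
           ) AS dt
     ) AS ranked
WHERE rn = 1
ORDER BY id
"""
result = conn.execute(nth_sql).fetchall()
for row in result:
    print(row)
```

If fewer than three non-NULL values exist for an id, the corresponding frame is empty and the column comes back as NULL.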