对于下面的原始数据,如何获得整个行的每个customer_id的最大数量,并为该行的其余部分获取null?我可以获得数据的最大值但不能以#Results
的形式获取#Raw data
customer_id name location itemno_1 itemno_2 itemno_3 itemno_4 itemno_5
123 Ashley M CA 10 null 10 null null
123 Ashley M CA null 12 null 12 null
143 Donald P FL 15 15 0 1 10
187 Alicia P GA 15 9 null null null
1736 Mike H CT null 8 8 9 null
1736 Mike H CT null null null null null
1876 David M CA null null null null null
532 Matthew T CA null 9 10 10 null
customer_id name location itemno_1 itemno_2 itemno_3 itemno_4 itemno_5
123 Ashley M CA null 12 null null null
143 Donald P FL 15 null null null null
187 Alicia P GA 15 null null null null
1736 Mike H CT null null null null null
1876 David M CA null null null null null
532 Matthew T CA null null null 10 null
答案 0 :(得分:1)
以下是产生预期结果的查询。(我已经测试过它)我假设如果2个item_nos具有相同的最大值,我们将保持最低item_no的值。例如,对于customer_id = 123 itemno_2和itemno_4的值为12,但将itemno_2保留为12,并将itemno_4设为null。
select customer_id, name, location1
,CASE WHEN (i1 >= i2 or i2 is null)
AND (i1 >= i3 or i3 is null)
AND (i1 >= i4 or i4 is null)
AND (i1 >= i5 or i5 is null)
THEN i1
ELSE null
END as itemno_1
,CASE WHEN (i2 >= i1 or i1 is null)
AND (i2 >= i3 or i3 is null)
AND (i2 >= i4 or i4 is null)
AND (i2 >= i5 or i5 is null)
AND (i1 <> i2 or i1 is null)
THEN i2
ELSE null
END as itemno_2
,CASE WHEN (i3 >= i1 or i1 is null)
AND (i3 >= i2 or i2 is null)
AND (i3 >= i4 or i4 is null)
AND (i3 >= i5 or i5 is null)
AND (i1 <> i3 or i1 is null)
AND (i2 <> i3 or i2 is null)
THEN i3
ELSE null
END as itemno_3
,CASE WHEN (i4 >= i1 or i1 is null)
AND (i4 >= i2 or i2 is null)
AND (i4 >= i3 or i3 is null)
AND (i4 >= i5 or i5 is null)
AND (i1 <> i4 or i1 is null)
AND (i2 <> i4 or i2 is null)
and (i3 <> i4 or i3 is null)
THEN i4
ELSE null
END as itemno_4
,CASE WHEN (i5 >= i1 or i1 is null)
AND (i5 >= i2 or i2 is null)
AND (i5 >= i3 or i3 is null)
AND (i5 >= i4 or i4 is null)
AND (i1 is null or i1 <> i5)
AND (i2 is null or i2 <> i5)
AND (i3 is null or i3 <> i5)
AND (i4 is null or i4 <> i5)
THEN i5
ELSE null
END as itemno_5
from (
select customer_id, name, location1
,max(itemno_1) as i1
,max(itemno_2) as i2
,max(itemno_3) as i3
,max(itemno_4) as i4
,max(itemno_5) as i5
from default.stack2
group by customer_id, name, location1) a
order by customer_id;
同样的事情也可以通过编写UDF而不是case语句来找到最多5列并按预期返回。