I need to get the count of null values from multiple columns of a table

Time: 2018-10-26 08:58:56

Tags: sql pyspark

Sample table:

color|country|value1|value2|value3
-----------------------------------
Red  | India |1     |null  |4
Blue | USA   |4     |2     |null
Red  | USA   |null  |1     |2
Blue | null  |4     |1     |1

Expected output:

Target | No_1 | No_2 | No_4 | No_null
value1 | 1    | 0    | 2    | 1
value2 | 2    | 1    | 0    | 1
value3 | 1    | 1    | 1    | 1

1 answer:

Answer 0: (score: 1)

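You can try UNION ALL to turn the three value columns into rows of (target, val), then use conditional aggregation with CASE WHEN. A CASE expression without an ELSE branch returns NULL when no condition matches, and COUNT skips NULLs, so each COUNT only counts the rows that match its condition.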

select
    target,
    -- count(...) ignores the NULL produced when the CASE does not match
    count(case when val = 1 then 1 end) as no_1,
    count(case when val = 2 then 1 end) as no_2,
    count(case when val = 4 then 1 end) as no_4,
    count(case when val is null then 1 end) as no_null
from
(
    -- one row per (column name, value) pair; tablename is a placeholder
    select 'value1' as target, value1 as val from tablename
    union all
    select 'value2', value2 from tablename
    union all
    select 'value3', value3 from tablename
) X
group by target
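
To check the query against the sample table from the question, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in engine, which is an assumption made only so the example runs without a Spark session. The table name `tablename` is the same placeholder as above, and the `ORDER BY target` clause is added here only to make the row order deterministic.

```python
import sqlite3

# In-memory stand-in for the question's table (sqlite3 is an assumption,
# chosen so the query can run without Spark).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename "
    "(color TEXT, country TEXT, value1 INT, value2 INT, value3 INT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?, ?)",
    [
        ("Red", "India", 1, None, 4),
        ("Blue", "USA", 4, 2, None),
        ("Red", "USA", None, 1, 2),
        ("Blue", None, 4, 1, 1),
    ],
)

QUERY = """
SELECT target,
       COUNT(CASE WHEN val = 1 THEN 1 END)     AS no_1,
       COUNT(CASE WHEN val = 2 THEN 1 END)     AS no_2,
       COUNT(CASE WHEN val = 4 THEN 1 END)     AS no_4,
       COUNT(CASE WHEN val IS NULL THEN 1 END) AS no_null
FROM (
    SELECT 'value1' AS target, value1 AS val FROM tablename
    UNION ALL
    SELECT 'value2', value2 FROM tablename
    UNION ALL
    SELECT 'value3', value3 FROM tablename
) x
GROUP BY target
ORDER BY target  -- added only for a stable row order
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On the sample data this should reproduce the three rows of the expected output. In PySpark, the same SQL text could presumably be run by registering the DataFrame with `createOrReplaceTempView("tablename")` and passing the query to `spark.sql(...)`.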