How to write CASE and GROUP BY in a Hive query

Time: 2016-05-25 14:56:12

Tags: hadoop hive hiveql

This is my Hive table:

course   dept    subject   status

btech     cse     java     pass
btech     cse     hadoop   fail
btech     cse     cg       detained
btech     cse     cc       pass
btech      it     daa      pass
btech      it     wt       pass
btech      it     cnn      pass
mba        hr     hrlaw    pass
mba        hr     hrguid   absent
mtech      cs     java     pass
mtech      cs     cd       pass
mtech      cs     cp       detained
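
To reproduce the problem, the sample data above could be loaded with a setup along these lines. This is only a sketch: the table name college is taken from the queries in the answers, the STRING column types are an assumption, and INSERT ... VALUES needs Hive 0.14 or later.

```sql
-- Hypothetical setup; column types are assumed, not stated in the question.
create table college (
  course  string,
  dept    string,
  subject string,
  status  string
);

-- Same twelve rows as the table shown above.
insert into table college values
  ('btech', 'cse', 'java',   'pass'),
  ('btech', 'cse', 'hadoop', 'fail'),
  ('btech', 'cse', 'cg',     'detained'),
  ('btech', 'cse', 'cc',     'pass'),
  ('btech', 'it',  'daa',    'pass'),
  ('btech', 'it',  'wt',     'pass'),
  ('btech', 'it',  'cnn',    'pass'),
  ('mba',   'hr',  'hrlaw',  'pass'),
  ('mba',   'hr',  'hrguid', 'absent'),
  ('mtech', 'cs',  'java',   'pass'),
  ('mtech', 'cs',  'cd',     'pass'),
  ('mtech', 'cs',  'cp',     'detained');
```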

I want to query this table to retrieve the data in the following form:

course   dept    status

btech     cse     fail
btech      it     pass
mba        hr     absent
mtech      cs     fail

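First, it should check for "fail" or "detained" in the status of each course and dept group. If it finds "fail" or "detained", it should output "fail" as the status. Otherwise, if "absent" appears in the same group, it should output "absent" as the status. Otherwise, it should output "pass".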

I get an error message when I run my query.

4 Answers:

Answer 0 (score: 11)

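When you group by course and dept, you get multiple values for the status column (one from each record in the group), and these need to be handled.
Any column in the SELECT that is not part of the GROUP BY should be inside an aggregate function. Here is a solution using the sum() function.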

select course, dept,
    case when sum(case when status in ( 'fail','detained') then 1 else 0 end) > 0 then 'fail'
         when sum(case when status in ('absent') then 1 else 0 end) > 0 then 'absent'
         when sum(case when status in ('pass') then 1 else 0 end) > 0 then 'pass'
         else 'no_result'
    end as final_status
from college
group by 
    course,dept
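
As a possible variation on the same idea, each status can be mapped to a numeric priority and the highest priority in the group taken with max(). This is only a sketch, not part of the original answer; the priority numbers 2, 1 and 0 are arbitrary choices made for illustration, and any status other than fail, detained or absent is treated as pass.

```sql
-- Sketch: rank each status, keep the worst rank per group, map it back to a label.
select course, dept,
    case max(case when status in ('fail', 'detained') then 2  -- highest priority
                  when status = 'absent' then 1
                  else 0                                       -- assumed: everything else counts as pass
             end)
        when 2 then 'fail'
        when 1 then 'absent'
        else 'pass'
    end as final_status
from college
group by
    course, dept;
```

On the sample data in the question this gives fail for btech/cse, pass for btech/it, absent for mba/hr and fail for mtech/cs, which is the requested result.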

Answer 1 (score: 3)

If I understand correctly, you need something like this:

select course,dept,
case 
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null 
end as Final_Status
from college
group by course,dept, 
   CASE when status in ( 'fail','detained') then 'FAILED'
   when status in ( 'absent') then 'absent'
   when status in ( 'pass') then 'PASSED'
   else null END;

I use the CASE expression in the GROUP BY, and it works in Hive.

Answer 2 (score: 1)

Try this.

select course,dept,
collect_set(
case 
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null 
end ) as Final_Status
from college
group by course,dept;
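
Note that collect_set() returns an array of the distinct values in each group rather than a single status. If one row with one status per group is wanted, the array could be reduced with array_contains(), roughly as sketched below. The subquery alias t and the column name statuses are made up for this illustration.

```sql
-- Sketch: collect the distinct statuses per group, then apply the priority rules.
select course, dept,
    case when array_contains(statuses, 'fail')
           or array_contains(statuses, 'detained') then 'fail'
         when array_contains(statuses, 'absent')   then 'absent'
         else 'pass'
    end as final_status
from (
    select course, dept, collect_set(status) as statuses
    from college
    group by course, dept
) t;
```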

Answer 3 (score: 0)

The problem is that the columns required by the GROUP BY must come last. The modified query below should now work.

select 
case 
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null 
end as Final_Status,course,dept
from college
group by course,dept;