This is my Hive table:
course dept subject status
btech cse java pass
btech cse hadoop fail
btech cse cg detained
btech cse cc pass
btech it daa pass
btech it wt pass
btech it cnn pass
mba hr hrlaw pass
mba hr hrguid absent
mtech cs java pass
mtech cs cd pass
mtech cs cp detained
I want to query this table to retrieve the data in the following form:
course dept status
btech cse fail
btech it pass
mba hr absent
mtech cs fail
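First, it should check for "fail" or "detained" in the status of each course and dept grouped together. If it finds "fail" or "detained", it should output "fail" as the status. Otherwise, if "absent" is found in the same group, it should output "absent" as the status. Otherwise, it should output "pass".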
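The rule described above can be sketched outside of Hive to make the intended priority explicit. This is only an illustration in plain Python: the names `ROWS`, `final_status` and `summarize` are made up for this sketch, and the rows are copied from the table in the question.

```python
from collections import defaultdict

# Sample rows copied from the question: (course, dept, subject, status).
ROWS = [
    ("btech", "cse", "java", "pass"),
    ("btech", "cse", "hadoop", "fail"),
    ("btech", "cse", "cg", "detained"),
    ("btech", "cse", "cc", "pass"),
    ("btech", "it", "daa", "pass"),
    ("btech", "it", "wt", "pass"),
    ("btech", "it", "cnn", "pass"),
    ("mba", "hr", "hrlaw", "pass"),
    ("mba", "hr", "hrguid", "absent"),
    ("mtech", "cs", "java", "pass"),
    ("mtech", "cs", "cd", "pass"),
    ("mtech", "cs", "cp", "detained"),
]


def final_status(statuses):
    """Collapse one group's statuses: fail/detained beats absent, absent beats pass."""
    if "fail" in statuses or "detained" in statuses:
        return "fail"
    if "absent" in statuses:
        return "absent"
    return "pass"


def summarize(rows):
    """Group rows by (course, dept) and return {(course, dept): final status}."""
    groups = defaultdict(set)
    for course, dept, _subject, status in rows:
        groups[(course, dept)].add(status)
    return {key: final_status(statuses) for key, statuses in groups.items()}


if __name__ == "__main__":
    for (course, dept), status in sorted(summarize(ROWS).items()):
        print(course, dept, status)
```

The point of the sketch is that the result has exactly one row per (course, dept) pair, so whatever query is used has to reduce all statuses in a group to a single value.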
When I run my query, I get an error message.
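Answer 0 (score: 11)
When you group by course and dept, you get multiple values of the status column (one from each record in the group), and these need to be handled.
Any column in the select list that is not part of the group by has to be wrapped in an aggregate function.
Here is a solution that uses the sum() function.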
select course, dept,
case when sum(case when status in ( 'fail','detained') then 1 else 0 end) > 0 then 'fail'
when sum(case when status in ('absent') then 1 else 0 end) > 0 then 'absent'
when sum(case when status in ('pass') then 1 else 0 end) > 0 then 'pass'
else 'no_result'
end as final_status
from college
group by
course,dept
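One way to sanity-check the query above without a Hive cluster is to run it against an in-memory SQLite database. This is a rough sketch under clear assumptions: SQLite is not Hive, the helper name `run_query` is hypothetical, and an `order by` clause has been added only to make the output order predictable.

```python
import sqlite3

# Sample rows copied from the question: (course, dept, subject, status).
ROWS = [
    ("btech", "cse", "java", "pass"),
    ("btech", "cse", "hadoop", "fail"),
    ("btech", "cse", "cg", "detained"),
    ("btech", "cse", "cc", "pass"),
    ("btech", "it", "daa", "pass"),
    ("btech", "it", "wt", "pass"),
    ("btech", "it", "cnn", "pass"),
    ("mba", "hr", "hrlaw", "pass"),
    ("mba", "hr", "hrguid", "absent"),
    ("mtech", "cs", "java", "pass"),
    ("mtech", "cs", "cd", "pass"),
    ("mtech", "cs", "cp", "detained"),
]

# The query from the answer; "order by" is an addition for stable output.
QUERY = """
select course, dept,
  case when sum(case when status in ('fail','detained') then 1 else 0 end) > 0 then 'fail'
       when sum(case when status in ('absent') then 1 else 0 end) > 0 then 'absent'
       when sum(case when status in ('pass') then 1 else 0 end) > 0 then 'pass'
       else 'no_result'
  end as final_status
from college
group by course, dept
order by course, dept
"""


def run_query(rows):
    """Load the rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table college (course text, dept text, subject text, status text)"
        )
        conn.executemany("insert into college values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_query(ROWS):
        print(*row)
```

Each inner sum() counts how many rows in the group carry a given status, and the outer case picks the highest-priority status whose count is non-zero.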
Answer 1 (score: 3)
If I understand correctly, you need something like this:
select course,dept,
case
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null
end as Final_Status
from college
group by course,dept,
CASE when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null END;
I use CASE in the GROUP BY, and it works with Hive.
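To see what shape of result grouping by the CASE expression produces, the query can be tried on the sample data. The sketch below uses an in-memory SQLite database as a stand-in for Hive, which is an assumption; `rows_grouped_by_case` is a hypothetical helper name and the `order by` is added only for readable output.

```python
import sqlite3

# Sample rows copied from the question: (course, dept, subject, status).
ROWS = [
    ("btech", "cse", "java", "pass"),
    ("btech", "cse", "hadoop", "fail"),
    ("btech", "cse", "cg", "detained"),
    ("btech", "cse", "cc", "pass"),
    ("btech", "it", "daa", "pass"),
    ("btech", "it", "wt", "pass"),
    ("btech", "it", "cnn", "pass"),
    ("mba", "hr", "hrlaw", "pass"),
    ("mba", "hr", "hrguid", "absent"),
    ("mtech", "cs", "java", "pass"),
    ("mtech", "cs", "cd", "pass"),
    ("mtech", "cs", "cp", "detained"),
]

# Same CASE expression in the select list and in the group by clause.
QUERY = """
select course, dept,
  case when status in ('fail','detained') then 'FAILED'
       when status in ('absent') then 'absent'
       when status in ('pass') then 'PASSED'
       else null end as final_status
from college
group by course, dept,
  case when status in ('fail','detained') then 'FAILED'
       when status in ('absent') then 'absent'
       when status in ('pass') then 'PASSED'
       else null end
order by course, dept, final_status
"""


def rows_grouped_by_case(rows):
    """Run the group-by-CASE query on an in-memory copy of the table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table college (course text, dept text, subject text, status text)"
        )
        conn.executemany("insert into college values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in rows_grouped_by_case(ROWS):
        print(*row)
```

On the sample data this yields one row per distinct (course, dept, mapped status) combination, so a group that contains both passing and failing subjects appears more than once. Whether that matches the one-row-per-group output requested in the question is worth checking before relying on it.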
Answer 2 (score: 1)
Try this.
select course,dept,
collect_set(
case
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null
end ) as Final_Status
from college
group by course,dept;
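Hive's collect_set aggregate returns the distinct values of a group as an array, so the query above gives one row per (course, dept) with a collection of statuses rather than a single status. The sketch below imitates that on SQLite, which has no collect_set: `group_concat(distinct ...)` is used as a stand-in and the resulting string is split into a Python set. Both the stand-in and the helper name `collected_statuses` are assumptions made for illustration.

```python
import sqlite3

# Sample rows copied from the question: (course, dept, subject, status).
ROWS = [
    ("btech", "cse", "java", "pass"),
    ("btech", "cse", "hadoop", "fail"),
    ("btech", "cse", "cg", "detained"),
    ("btech", "cse", "cc", "pass"),
    ("btech", "it", "daa", "pass"),
    ("btech", "it", "wt", "pass"),
    ("btech", "it", "cnn", "pass"),
    ("mba", "hr", "hrlaw", "pass"),
    ("mba", "hr", "hrguid", "absent"),
    ("mtech", "cs", "java", "pass"),
    ("mtech", "cs", "cd", "pass"),
    ("mtech", "cs", "cp", "detained"),
]

# group_concat(distinct ...) stands in for Hive's collect_set(...).
QUERY = """
select course, dept,
  group_concat(distinct
    case when status in ('fail','detained') then 'FAILED'
         when status in ('absent') then 'absent'
         when status in ('pass') then 'PASSED'
         else null end) as final_status
from college
group by course, dept
"""


def collected_statuses(rows):
    """Return {(course, dept): set of mapped statuses} for the given rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table college (course text, dept text, subject text, status text)"
        )
        conn.executemany("insert into college values (?, ?, ?, ?)", rows)
        return {
            (course, dept): set(joined.split(",")) if joined else set()
            for course, dept, joined in conn.execute(QUERY)
        }
    finally:
        conn.close()


if __name__ == "__main__":
    for key, statuses in sorted(collected_statuses(ROWS).items()):
        print(*key, sorted(statuses))
```

If a single status per group is needed, the collected values still have to be reduced by priority in a further step.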
Answer 3 (score: 0)
The problem is that the columns required by the group by must come last in the select list. The modified query below should now work.
select
case
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null
end as Final_Status,course,dept
from college
group by course,dept;