Efficient subquery in Hive

Time: 2015-07-03 21:28:30

Tags: mysql hadoop hive hiveql

I have a table -

Employee  Dept  Visited
1         a     yes
1               yes
1               yes
2         b
1         b     yes
2               yes
3         ab
4         ac    yes
5               yes
5               yes
6         fe
6
7         ad    yes
2         ad    yes
3         a     yes
3         c
6               yes
7
8         a     yes
8               yes
9         fe    yes

I need to find all employees that do not have 2 (or more) rows with an empty Dept value and Visited = yes.

I tried writing the query in Hive as follows -

select c.Employee 
from table c
where c.Employee NOT IN (select d.Employee from table d where Visited = 'Yes' and Dept = '' group by d.Employee having count(d.Employee) >=2)
;

It works, but the query takes a long time to run, so I believe it can be done better. Any suggestions?

1 Answer:

Answer 0: (score: 2)

I would suggest using group by with having, which reads the table once instead of running a NOT IN subquery:

select c.Employee
from table c
group by c.Employee
having sum(case when c.dept is null and c.visited = 'Yes' then 1 else 0 end) < 2;
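Two details may be worth checking before relying on the query above. The question filters on Dept = '' while the answer tests dept is null, and the sample rows show 'yes' in lower case while both queries compare against 'Yes' (Hive string comparison is case-sensitive). The following is only a sketch of the same conditional-aggregation idea that tolerates both variants. The table name employee_visits is hypothetical and stands in for the placeholder `table` used in the thread; whether blank Dept values are stored as NULL or as empty strings is an assumption to verify against the real data.

```sql
-- Sketch only: employee_visits is a hypothetical name for the thread's `table`.
-- Returns each employee with fewer than 2 rows where Dept is blank and Visited is yes.
SELECT e.Employee
FROM employee_visits e
GROUP BY e.Employee
HAVING SUM(
         CASE
           WHEN (e.Dept IS NULL OR trim(e.Dept) = '')  -- blank Dept, stored as NULL or ''
                AND lower(e.Visited) = 'yes'           -- case-insensitive match on Visited
           THEN 1
           ELSE 0
         END
       ) < 2;
```

If the blank cells in the sample table are the empty Dept values, this should exclude employees 1 and 5, which each have two such rows. Unlike the NOT IN version, which returns one row per matching source row, the grouped form returns each employee once.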