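Can someone help me understand how the mappers and reducers are created for the following query? The query and its result are shown below. I would also like to know:
Is there any way to improve the performance of the query?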
hive>
> select agegroup , sum(safe_cnt),sum(unsafe_cnt) from
> (
> select case when b.age <= 25 then 'age25' when (b.age > 25 and b.age < 35) then 'age35'
> when (b.age >= 35 and b.age < 45) then 'age45' else 'age50' end as agegroup ,
> a.safe_cnt, a.unsafe_cnt from
> (select cust_id, sum (case when place = 'safe' then 1 else 0 end) as safe_cnt, sum (case when place = 'unsafe' then 1 else 0 end) as unsafe_cnt from directive_ana.sensor_main group by cust_id) a
> left outer join directive_ana.cust_main b where a.cust_id = b.cust_id
> ) e
> group by agegroup ;
Query ID = root_xxxxx_xxxxxXxxx_xxx_xxx_xxx
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_147071xxxxx_xxx)
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 .......... SUCCEEDED 1 1 0 0 0 0
Map 4 .......... SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0
Reducer 3 ...... SUCCEEDED 1 1 0 0 0 0
--------------------------------------------------------------------------------
VERTICES: 04/04 [==========================>>] 100% ELAPSED TIME: 9.17 s
--------------------------------------------------------------------------------
OK
age25 8 7
age35 202 166
age45 194 163
age50 740 611
Time taken: 14.04 seconds, Fetched: 4 row(s)
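
In case it helps with answering: as far as I understand, prefixing the same query with EXPLAIN should print the plan without running it, including the stage dependencies and the operator tree for each vertex, which is where the Map and Reducer vertices in the progress table come from. A sketch, with the query unchanged from the one I ran; I have not pasted its output here:

```sql
-- EXPLAIN prints the execution plan instead of running the query.
-- The query body is identical to the one above, only reindented.
EXPLAIN
select agegroup, sum(safe_cnt), sum(unsafe_cnt) from
(
  select case when b.age <= 25 then 'age25'
              when (b.age > 25 and b.age < 35) then 'age35'
              when (b.age >= 35 and b.age < 45) then 'age45'
              else 'age50' end as agegroup,
         a.safe_cnt, a.unsafe_cnt
  from
    (select cust_id,
            sum(case when place = 'safe' then 1 else 0 end) as safe_cnt,
            sum(case when place = 'unsafe' then 1 else 0 end) as unsafe_cnt
     from directive_ana.sensor_main
     group by cust_id) a
  left outer join directive_ana.cust_main b
  where a.cust_id = b.cust_id
) e
group by agegroup;
```

My guess is that the two Map vertices correspond to the scans of sensor_main and cust_main, and that the two Reducer vertices handle the join and the aggregations, but I would like that confirmed against the plan.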