I have the following data.
emp.json
{id:1, name:'abc', deptid:10}
{id:2, name:'def', deptid:20}
{id:3, name:'ghi', deptid:10}
{id:4, name:'jkm', deptid:20}
dept.json
{dept_id:10, dept_name:'PIG'}
{dept_id:20, dept_name:'JSON'}
I have the following script.
emp_data = LOAD '/user/JsonExample/emp.json' USING JsonLoader('id:int,name:chararray, deptid:int');
dept_data = LOAD '/user/JsonExample/dept.json' USING JsonLoader('dept_id:int,dept_name:chararray');
emp_data = FOREACH emp_data GENERATE id,name as name,deptid;
dept_data = FOREACH dept_data GENERATE dept_id,dept_name;
joined_data = JOIN emp_data by (deptid), dept_data by (dept_id);
joined_data = FOREACH joined_data GENERATE id,name,deptid,dept_name;
STORE joined_data INTO 'join_output.json' USING JsonStorage();
I get the following output.
{emp_data::id:1, emp_data::name:'abc',emp_data::dept_id:10, dept_data::dept_name:'PIG'}
{emp_data::id:2, emp_data::name:'def',emp_data::dept_id:20, dept_data::dept_name:'JSON'}
{emp_data::id:3, emp_data::name:'ghi',emp_data::dept_id:10, dept_data::dept_name:'PIG'}
{emp_data::id:4, emp_data::name:'jkm',emp_data::dept_id:20, dept_data::dept_name:'JSON'}
But I want the following output.
{id:1, name:'abc',dept_id:10, dept_name:'PIG'}
{id:2, name:'def',dept_id:20, dept_name:'JSON'}
{id:3, name:'ghi',dept_id:10, dept_name:'PIG'}
{id:4, name:'jkm',dept_id:20, dept_name:'JSON'}
Please tell me how to get the desired output.
Thanks in advance.
Answer 0 (score: 1)
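After a JOIN, Pig prefixes each field name with its source relation (emp_data:: or dept_data::), and JsonStorage writes those prefixed names as keys. Renaming each field explicitly with "as" should work: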
joined_data = FOREACH joined_data GENERATE
emp_data::id as id,
emp_data::name as name,
emp_data::deptid as deptid,
dept_data::dept_name as dept_name;
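
For reference, here is a minimal end-to-end sketch of the same idea, assuming the emp_data and dept_data relations are loaded as in the question. The relation name clean_data is only an illustrative choice and does not appear in the original script. Note that the desired output uses the key dept_id rather than deptid, so this sketch projects dept_data::dept_id; for an inner join on deptid = dept_id the two values are equal. DESCRIBE is included only to show how to check the prefixed field names yourself.

```pig
-- Sketch only: assumes emp_data and dept_data were loaded as in the question.
joined_data = JOIN emp_data BY deptid, dept_data BY dept_id;

-- Print the schema of the joined relation; after a JOIN the fields are
-- disambiguated with the source relation name and '::'.
DESCRIBE joined_data;

-- Project with explicit aliases so the stored JSON keys carry no prefix.
-- clean_data is a hypothetical relation name chosen for this example.
clean_data = FOREACH joined_data GENERATE
    emp_data::id AS id,
    emp_data::name AS name,
    dept_data::dept_id AS dept_id,
    dept_data::dept_name AS dept_name;

STORE clean_data INTO 'join_output.json' USING JsonStorage();
```

If you would rather keep the key name deptid, as in the answer above, replace the third projection with emp_data::deptid AS deptid.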