Trying to join the contents of two files in Pig
StringFile = load 'String' using PigStorage(',') as (name,branch,div); -- string values
NumFile = load 'num' using PigStorage(',') as (id,m1,m2,m3,m4); -- numeric values
joined = join id by name,(m1,m2) by branch,div by (m3,m4);
store joined into 'joinedfile' using PigStorage(',');
But it shows:
[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file filterjoin.pig, line 4, column 14> Syntax error, unexpected symbol at or near '('
Anju,IT,A -- StringFile
1,5.3,3.6,1.6,0.3 -- NumFile
The output I am trying to get:
1,Anju,5.3,3.6,IT,A,1.6,0.3
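Am I doing something wrong?
From the textbook:
You can also join on multiple keys. In all cases you must have the same number of keys, and they must be of the same or compatible types (where compatible means that an implicit cast can be inserted).
1. Does there have to be the same number of keys on each side?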
id by name
(m1,m2) by branch
div by (m3,m4)
Is this not possible?
2. When joining, do the data types have to be the same?
Answer 0 (score: 1)
I think you have misunderstood what join does. It joins two datasets on a common element, so the syntax is:
C = join A by a1, B by b1;
where a1 and b1 are fields of their respective relations that hold values common to both.
Example:
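As for the multi-key case quoted from the textbook in the question: a minimal sketch, assuming two hypothetical relations A and B whose file names, field names, and types are all invented here for illustration. Each side lists its keys in parentheses, both lists have the same length, and the keys are matched pairwise by position.

```pig
-- Hypothetical inputs: the paths 'a' and 'b' and every field name are assumptions.
A = load 'a' using PigStorage(',') as (a1:int, a2:chararray, x:double);
B = load 'b' using PigStorage(',') as (b1:int, b2:chararray, y:double);

-- Two keys on each side: a1 is compared with b1, and a2 with b2.
-- Each pair must have the same type or types Pig can implicitly cast between.
C = join A by (a1, a2), B by (b1, b2);
```

This is why line 4 of the script in the question is rejected: join expects a relation name before each `by`, and the key lists on both sides have to line up one to one.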
students =
1 rob
2 john
3 fred
gpas =
1 3.2
2 3.8
3 4.0
A = join students by id, gpas by id;
A =
1 rob 1 3.2
2 john 2 3.8
3 fred 3 4.0
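The two files in the question share no key, so a plain join has nothing to match on. One possible workaround, sketched here under assumptions rather than offered as the definitive answer, is to give each relation a row number with RANK (available in Pig 0.11 and later) and join on that. The field types and the aliases RankedStrings, RankedNums, paired, and result are assumptions made for this sketch, and it only makes sense if row N of one file really corresponds to row N of the other.

```pig
-- Field types are assumed; the question declares no types.
StringFile = load 'String' using PigStorage(',') as (name:chararray, branch:chararray, div:chararray);
NumFile = load 'num' using PigStorage(',') as (id:int, m1:double, m2:double, m3:double, m4:double);

-- RANK without a BY clause prepends a sequential row number as the first field ($0).
RankedStrings = rank StringFile;
RankedNums = rank NumFile;

-- Pair the rows by position, using the prepended row number as the join key.
paired = join RankedNums by $0, RankedStrings by $0;

-- Reorder the fields to the layout asked for: id,name,m1,m2,branch,div,m3,m4.
result = foreach paired generate
    RankedNums::id, RankedStrings::name,
    RankedNums::m1, RankedNums::m2,
    RankedStrings::branch, RankedStrings::div,
    RankedNums::m3, RankedNums::m4;

store result into 'joinedfile' using PigStorage(',');
```

Pairing by position depends on the order in which Pig reads the records, which is only dependable for small single-file inputs. If the data has, or can be given, a real shared key, joining on that key is the safer design.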