I have two data objects in Pig.
data_1:
col_a: chararray,
col_b: int,
col_c: int,
col_d: chararray
data_2:
col_a: chararray,
col_b: chararray,
col_c: int,
col_d: int,
col_e: int
I want to join the two. I have tried both of these:
all_data = JOIN data_1 BY (col_a) LEFT, data_2 by (col_b);
all_data = JOIN data_1 BY (col_a), data_2 by (col_b);
When I try to DUMP the result (after limiting it to 10 records), both options give the same error:
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: all_data_limit: Limit - scope-6383 Operator Key: scope-6383): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: all_data: New For Each(true,true)[tuple] - scope-6382 Operator Key: scope-6382): org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.ClassCastException: org.apache.pig.impl.io.NullableText cannot be cast to org.apache.pig.impl.io.NullableBytesWritable
I am a bit frustrated: I have been looking for a solution for 3 days now and cannot find one. Any help would be great. Thanks!
Answer 0 (score: 1)
Use the following commands, which wrap both join keys in TRIM:
all_data = JOIN data_1 BY TRIM(col_a) LEFT, data_2 by TRIM(col_b);
all_data = JOIN data_1 BY TRIM(col_a), data_2 by TRIM(col_b);
Let me know if it runs correctly.
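A note on why TRIM may help, offered as a guess rather than a confirmed diagnosis: the ClassCastException names NullableText (Pig's wrapper for chararray keys) and NullableBytesWritable (its wrapper for bytearray keys), which suggests that one of the two join keys is still an untyped bytearray at runtime even though the schema lists it as chararray. TRIM returns a chararray, so wrapping both keys in it would force them to the same type. If that assumption holds, an explicit cast should work as well and leaves the values unmodified. The sketch below reuses the relation and column names from the question; data_1_typed and data_2_typed are hypothetical aliases introduced only for illustration.

```pig
-- Assumption: the error comes from a chararray key being joined against a
-- key that is still a bytearray at runtime. Cast both keys explicitly.
data_1_typed = FOREACH data_1 GENERATE (chararray)col_a AS col_a, col_b, col_c, col_d;
data_2_typed = FOREACH data_2 GENERATE col_a, (chararray)col_b AS col_b, col_c, col_d, col_e;

-- Check that both join keys are now reported as chararray.
DESCRIBE data_1_typed;
DESCRIBE data_2_typed;

-- Same left join as in the question, on the explicitly typed keys.
all_data = JOIN data_1_typed BY col_a LEFT OUTER, data_2_typed BY col_b;
all_data_limit = LIMIT all_data 10;
DUMP all_data_limit;
```

If DESCRIBE already showed chararray for both keys before the cast, the mismatch probably originates earlier in the script (for example in a LOAD without an explicit schema, or in a UDF without a declared output schema), and that step would be the better place to add the type.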