The Pig script I created works, except when I try to use GENERATE on the field I joined on.
cc_data = LOAD 'default.complaint1' USING org.apache.hive.hcatalog.pig.HCatLoader();
cc2_data = LOAD 'default.complaint2' USING org.apache.hive.hcatalog.pig.HCatLoader();
combined = join cc_data by complaintid, cc2_data by complaintid;
If I run DESCRIBE on combined, it shows the following:
combined:
{cc_data::daterecieved: chararray,
cc_data::product: chararray,
cc_data::subproduct: chararray,
cc_data::issue: chararray,
cc_data::subissue: chararray,
cc_data::consumercomplaintnarrative: chararray,
cc_data::companypublicresponse: chararray,
cc_data::company: chararray,
cc_data::state: chararray,
cc_data::zip: chararray,
cc_data::submitted: chararray,
cc_data::datesenttocompany: chararray,
cc_data::companyresponsetoconsumer: chararray,
cc_data::timelyresponse: chararray,
cc_data::consumerdisputed: chararray,
cc_data::complaintid: int,
cc2_data::complaintid: int,
cc2_data::complaintamount: float,
cc2_data::consumerzip: int,
cc2_data::creditrating: chararray,
cc2_data::bankrupthistory: chararray}
I can use FOREACH and GENERATE on every field except complaintid. I even tried cc_data.complaintid. I get this error:
ERROR 1025:
<file pig_read_orcfile.pig, line 13, column 190> Invalid field projection. Projected field [complaintid] does not exist in schema
Any ideas? Any help would be greatly appreciated!
Answer 0: (score: 0)
Please try:
... FOREACH combined GENERATE cc_data::complaintid;
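The reason this should work: after a JOIN, Pig prefixes each field with the alias of the relation it came from, using the :: operator. A field whose name is unique in the joined schema can still be referenced by its short name, but complaintid appears in both cc_data and cc2_data, so it has to be qualified. A minimal sketch follows, assuming the combined relation from the question; the aliases ids and slim are hypothetical names introduced only for illustration.

```pig
-- Sketch only: 'ids' and 'slim' are hypothetical aliases, not part of the original script.

-- Qualify the join key with :: and rename it, so later steps see a single complaintid.
ids = FOREACH combined GENERATE cc_data::complaintid AS complaintid;

-- Fields that exist in only one input (product, complaintamount) are unambiguous,
-- so their short names can still be used after the join.
slim = FOREACH combined GENERATE
    cc_data::complaintid AS complaintid,
    product,
    complaintamount;

-- Check the resulting schema.
DESCRIBE slim;
```

Renaming the key with AS right after the join is a design choice rather than a requirement; it simply avoids having to repeat the cc_data:: prefix in every later statement.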