Check whether a bag is empty inside a FOREACH in Pig

Asked: 2013-10-14 11:51:49

Tags: hadoop apache-pig

I am joining 3 tables, and inside a FOREACH I need to check whether the ReadStagingData bag is empty. Here is the code:

ReadStagingData = Load 'Staging_data.csv' Using PigStorage(',') As (PL_Posn_id:int,Brok_org_dly:double,Brok_org_ptd:double);

ReadPriorData = Load 'ptd.csv' Using PigStorage(',') As (PL_Posn_id:int,Brok_org_ptd:double);

ReadPriorFunctional = Load 'Functional.csv' Using PigStorage(',') AS (PL_Posn_id:int,Brok_fun_ptd:double,Brok_fun_ltd:double);

JoinDS1 = JOIN ReadPriorData BY PL_Posn_id,ReadPriorFunctional BY PL_Posn_id;

JoinDS2 = JOIN ReadStagingData by PL_Posn_id Left OUTER,JoinDS1 BY ReadPriorData::PL_Posn_id;

X = Foreach JoinDS2 {
    test = (NOT(IsEmpty(ReadStagingData))); -- Error on this line
    GENERATE test,ReadStagingData::PL_Posn_id,
    ReadStagingData::Brok_org_dly,
   (ReadStagingData::Brok_org_ptd is not null ? ReadStagingData::Brok_org_ptd:ReadPriorData::Brok_org_ptd+ReadStagingData::Brok_org_dly);
};

Dump X;

When I run the code above, I get the error INVALID PROJECTION ReadStagingData. Please help me.

1 Answer:

Answer 0: (score: 0)

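In your relation X, ReadStagingData is not a bag. The notation ReadStagingData::Brok_org_dly does not denote a projection out of a bag; it is a top-level field that is named this way after a JOIN so that every field has a unique name. So ReadStagingData is just a prefix.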
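To illustrate the difference, here is a minimal sketch that contrasts JOIN with COGROUP, which does keep one bag per input relation. It reuses the aliases from the question; Grouped, Flags and has_staging are hypothetical names introduced only for this example, and COGROUP is an alternative I am swapping in for illustration, not what the original script does.

```pig
-- After a JOIN the output is flat: DESCRIBE lists fields such as
-- ReadStagingData::PL_Posn_id, and there is no bag named ReadStagingData.
DESCRIBE JoinDS2;

-- COGROUP (an alternative, assumed here for illustration) produces one bag
-- per input relation, so IsEmpty can be applied to ReadStagingData.
Grouped = COGROUP ReadStagingData BY PL_Posn_id, ReadPriorData BY PL_Posn_id;

Flags = FOREACH Grouped GENERATE
    group AS PL_Posn_id,
    (IsEmpty(ReadStagingData) ? 0 : 1) AS has_staging;  -- hypothetical field name
```

With COGROUP, a key that appears only in ReadPriorData gets an empty ReadStagingData bag, which is the situation IsEmpty is meant to detect.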

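Also, I am not sure why you want to check this: since you are doing a LEFT OUTER join, X will contain no record that lacks a corresponding record in ReadStagingData. It would be different if you were doing a RIGHT OUTER join.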

If you did intend a RIGHT OUTER join and want to check whether the fields coming from ReadStagingData are NULL, I would do this:

rsdIsNull = ReadStagingData::PL_Posn_id IS NULL;
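For context, a minimal sketch of how that check might fit into the full script, assuming the join is changed to RIGHT OUTER. JoinRight, Y and rsd_missing are hypothetical names, and the flag is written as a conditional expression that yields 1 or 0 rather than a bare boolean. It also assumes that ReadPriorData::PL_Posn_id still resolves unambiguously after the second join, as it does in the question's own script.

```pig
-- RIGHT OUTER keeps every JoinDS1 row, so the ReadStagingData fields
-- are NULL wherever no staging record matched.
JoinRight = JOIN ReadStagingData BY PL_Posn_id RIGHT OUTER, JoinDS1 BY ReadPriorData::PL_Posn_id;

Y = FOREACH JoinRight GENERATE
    ReadPriorData::PL_Posn_id AS PL_Posn_id,
    -- 1 when there was no matching staging row, 0 otherwise
    (ReadStagingData::PL_Posn_id IS NULL ? 1 : 0) AS rsd_missing;

DUMP Y;
```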