How to refer to fields by name in Pig Latin after loading

Time: 2014-06-14 11:38:15

Tags: apache-pig

I have a large file with a large number of columns, which I am loading like this:

A = LOAD '/path/to/file' USING PigStorage(',');

B = FOREACH A GENERATE $0 AS name, $1 as address, $2.. ;
C = FOREACH B FILTER BY (name is NOT NULL);

I get the error "Projected field [name] does not exist". I don't want to address the columns as $0, $1, and so on. How can I give them identifiers?

2 answers:

Answer 0: (score: 1)

That Pig script doesn't work for me, but changing it to:

A = LOAD '/path/to/file' USING PigStorage(',');
B = FOREACH A GENERATE $0 AS name, $1 as address, $2 as another;
C = FILTER B BY (name is NOT NULL);

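does work. FILTER is a relational operator in its own right, so it is written as FILTER B BY ... rather than FOREACH B FILTER BY ...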
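Another way to give the columns identifiers is to declare them in the AS clause of the LOAD statement, so no renaming FOREACH is needed. A minimal sketch, assuming the file has three columns that can all be read as chararray; the names and types below are illustrative, not taken from the real data:

```pig
-- Hypothetical schema: adjust the field names and types to the real columns.
A = LOAD '/path/to/file' USING PigStorage(',')
    AS (name:chararray, address:chararray, another:chararray);

-- Print the schema of A to confirm the field names Pig will accept.
DESCRIBE A;

-- Fields can now be referenced by name instead of $0, $1, ...
C = FILTER A BY name IS NOT NULL;
```

Running DESCRIBE on a relation before filtering is a quick way to check whether a name such as name is actually part of its schema.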

Answer 1: (score: 0)

A nested FOREACH would be a better choice:

B = FOREACH A {
    filtered_rec = FILTER A BY (name IS NOT NULL);
    GENERATE filtered_rec;
}
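As far as I know, a FILTER inside a nested FOREACH applies to a bag field of the outer relation, so it is normally used after a GROUP rather than directly on a freshly loaded relation. A minimal sketch under that assumption, with a hypothetical two-column schema and a hypothetical grouping on address:

```pig
-- Hypothetical schema: adjust the field names and types to the real columns.
A = LOAD '/path/to/file' USING PigStorage(',')
    AS (name:chararray, address:chararray);

-- Grouping produces one record per address, holding a bag named A.
G = GROUP A BY address;

-- Inside the nested block, A refers to that bag, so it can be filtered.
B = FOREACH G {
    named = FILTER A BY name IS NOT NULL;
    GENERATE group AS address, named;
}
```

For simply dropping records with a null name, a top-level FILTER is shorter; the nested form is mainly useful when the filtering has to happen per group.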