I have a large file with a lot of columns, which I am loading:
A = LOAD '/path/to/file' USING PigStorage(',');
B = FOREACH A GENERATE $0 AS name, $1 as address, $2.. ;
C = FOREACH B FILTER BY (name is NOT NULL);
I get an error saying the projected field [name] does not exist. I don't want to refer to the columns as $0, $1 and so on. How can I give them identifiers?
Answer 0: (score: 1)
That Pig script doesn't work for me either, but changing it to:
A = LOAD '/path/to/file' USING PigStorage(',');
B = FOREACH A GENERATE $0 AS name, $1 as address, $2 as another;
C = FILTER B BY (name is NOT NULL);
does work.
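As a variant of the same idea, the names can also be attached when the file is loaded, using the AS clause of LOAD. This is only a minimal sketch: the third column name (another) and the chararray types are assumptions, and only the columns declared in the schema get identifiers this way.

```pig
-- Sketch only: the name of the third column and all types are assumptions.
A = LOAD '/path/to/file' USING PigStorage(',')
    AS (name:chararray, address:chararray, another:chararray);
-- Fields can now be referenced by name instead of $0, $1, ...
C = FILTER A BY (name IS NOT NULL);
```

With a schema declared at load time, the separate FOREACH ... GENERATE renaming step is no longer needed for the declared columns.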
Answer 1: (score: 0)
A nested FOREACH would be a better option:
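B = FOREACH A GENERATE $0 AS name, $1 AS address, $2 AS another;
-- Operators inside a nested FOREACH work on bag fields of the relation
-- being iterated, so this sketch assumes the rows are grouped first.
G = GROUP B ALL;
C = FOREACH G {
    filtered_rec = FILTER B BY (name is not null);
    GENERATE FLATTEN(filtered_rec);
}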