I have the following:
(id:int,names:chararray)
I grouped by id and built a bag of names. The names bag can contain null values. How do I remove the nulls from the bag?
Answer 0 (score: 1)
You can remove tuples from the bag created by GROUP BY by using a FILTER nested inside a FOREACH.
inpt = LOAD '...' as (id: int, names: chararray);
grp = GROUP inpt BY id;
result = FOREACH grp {
    no_nulls = FILTER inpt BY names is not null;
    GENERATE group, no_nulls;
};
Or simply filter out the null names before grouping:
inpt = LOAD '...' as (id: int, names: chararray);
no_nulls = FILTER inpt BY names is not null;
grp = GROUP no_nulls BY id;
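One thing to keep in mind: the two approaches differ when every name for a given id is null. Filtering inside the FOREACH keeps that id with an empty bag, while filtering before the GROUP drops the id from the output entirely. The sketch below is a plain Python model of that behaviour, not Pig code; the sample rows and function names are hypothetical, and None stands in for a Pig null.

```python
from collections import defaultdict

# Hypothetical sample rows shaped like (id, names); None models a Pig null.
rows = [(1, "a"), (1, None), (2, None)]


def filter_inside_groups(rows):
    """Model of GROUP first, then a FILTER nested inside FOREACH."""
    groups = defaultdict(list)
    for id_, name in rows:
        groups[id_].append((id_, name))
    # Every id survives; only the null-name tuples are removed from each bag.
    return {id_: [t for t in bag if t[1] is not None]
            for id_, bag in groups.items()}


def filter_before_group(rows):
    """Model of FILTER first, then GROUP."""
    groups = defaultdict(list)
    for id_, name in rows:
        if name is not None:
            groups[id_].append((id_, name))
    # An id whose names are all null never reaches the grouping step.
    return dict(groups)


print(filter_inside_groups(rows))  # {1: [(1, 'a')], 2: []}
print(filter_before_group(rows))   # {1: [(1, 'a')]}
```

If you need one output row per id even when all of its names are null, the nested FILTER is probably the version you want; otherwise filtering first is simpler and passes less data to the GROUP.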