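I have a relation J2 that looks like (Popp,{(100)}) (Urman,{(100)}) (Sciarra,{(100)}) (Chen,{(100)}) (Faviet,{(100)}) (Gietz,{()}) (Higgins,{()}) (LAST_NAME,{()}) (Grant,{()}) .... I want to test whether the bag is empty, so I tried: S = FILTER J2 BY IsEmpty($1); It runs successfully, but the output is empty. Can anyone point me in the right direction? Are there any prerequisites for using IsEmpty()?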
Note: DESCRIBE J2 gives "{AA::LAST_NAME:chararray,{(int)}}"
Answer 0 (score: 0)
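Here is the explanation: your bags contain tuples, so in a case like Gietz,{()} the bag itself is not empty; it holds one empty tuple. Let's test the following input: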
Urman,{(100)}
Gietz,{()}
LAST_NAME,{}
If you project the size of the bag, you get the following result:
(Urman,1)
(Gietz,1)
(LAST_NAME,0)
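The bag size for Gietz is also 1, because the bag contains one tuple. It makes no difference that the tuple itself is empty, which is why IsEmpty() does not match those rows.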
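The size projection described above could be written roughly as follows. This is a minimal sketch, not the answerer's exact script: the file name 'sample.txt' is a hypothetical file holding the three test rows, and the aliases data and sizes are made up for illustration. SIZE() and IsEmpty() are Pig built-ins.

```pig
-- Hypothetical input file containing the three sample rows shown above.
data = LOAD 'sample.txt' USING PigStorage(',') AS (name:chararray, b:bag{(val:int)});

-- SIZE counts the tuples in the bag; an empty tuple still counts as one tuple.
-- The bincond shows what IsEmpty() reports for the same bag.
sizes = FOREACH data GENERATE name,
                              SIZE(b) AS bag_size,
                              (IsEmpty(b) ? 'empty' : 'not empty') AS status;
DUMP sizes;
```

Comparing bag_size and status side by side makes it easy to see that only a bag with zero tuples counts as empty.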
How to do it (this is a working solution):
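-- Load the name and its bag; each tuple in the bag holds a single int.
data = LOAD 'SO/name.txt' USING PigStorage(',') AS (name:chararray,b:bag{(val:int)});
DESCRIBE data;
-- Flatten the bag so the value inside each tuple becomes the field x.
a = FOREACH data GENERATE name AS name, b AS b, FLATTEN($1) AS x;
-- Keep only the rows whose flattened value is null (the empty tuples).
b = FILTER a BY x IS NULL;
DUMP b;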
This dumps:
(Gietz,{()},)
(Higgins,{()},)
(LAST_NAME,{()},)
(Grant,{()},)
Input:
Popp,{(100)}
Urman,{(100)}
Sciarra,{(100)}
Chen,{(100)}
Faviet,{(100)}
Gietz,{()}
Higgins,{()}
LAST_NAME,{()}
Grant,{()}
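Applied to the relation from the question, the same idea might look like the sketch below. It assumes J2 is already defined with the schema reported by DESCRIBE, and it addresses fields by position because the bag has no alias. The aliases J2_flat, S and S_names and the field names last_name and val are introduced here only for illustration.

```pig
-- J2 comes from the question; $0 is the name and $1 is the bag.
J2_flat = FOREACH J2 GENERATE $0 AS last_name, FLATTEN($1) AS val;

-- Rows whose bag held only an empty tuple end up with a null val.
S = FILTER J2_flat BY val IS NULL;

-- Keep just the names of those rows.
S_names = FOREACH S GENERATE last_name;
DUMP S_names;
```

One caveat: FLATTEN produces no rows at all for a bag with zero tuples, so this filter only catches bags like {()}. A truly empty bag {} is the case that IsEmpty() is meant to catch, so the two checks complement each other.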