Pig filter with OR operator

Time: 2016-11-18 15:10:47

Tags: apache-pig

a = load '/user/home/samp.txt' using PigStorage(',');
dump a;
(2008-Jan-12,12.1,13.1,36.0)
(2008-Jan-13,13.1,14.1,45.00)
(2008-Jan-15,14.2,15.2,47.00)
(2008-Jan-16,16.1,17.1,47.5)
(2008-Jan-12,8.5,17,50,12.0)
(2008-Jan-12,n#/a,n#/a,n#/a)
(2008-Jan-19,n#/a,n#/a,n#/a)
(2008-Jan-12,n#/a,n#/a,27)
(2008-Jan-12,n#/a,13.00,n#/a)
b = filter a by ($1!='n#/a' OR $2!='n#/a' OR $3!='n#/a');
dump b;
(2008-Jan-12,12.1,13.1,36.0)
(2008-Jan-13,13.1,14.1,45.00)
(2008-Jan-15,14.2,15.2,47.00)
(2008-Jan-16,16.1,17.1,47.5)
(2008-Jan-12,8.5,17,50,12.0)
(2008-Jan-12,n#/a,n#/a,27)
(2008-Jan-12,n#/a,13.00,n#/a)

Why am I still getting "n#/a" in b?

1 answer:

Answer 0 (score: 2)

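The result is as expected, because you are combining != with OR. A row is kept as soon as at least one of the three conditions is true, so (2008-Jan-12,n#/a,n#/a,27) and (2008-Jan-12,n#/a,13.00,n#/a) still pass: each has one field that is not "n#/a". Only rows in which all three fields are "n#/a" are dropped.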
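To make this visible per row, here is a minimal sketch that emits one flag per comparison. The relation name flags and the 1/0 encoding are illustrative assumptions, not part of the question. A row passes the OR filter as soon as any one of its three flags is 1.

```pig
-- Same load as in the question; fields are addressed by position.
a = load '/user/home/samp.txt' using PigStorage(',');

-- 'flags' is a hypothetical relation name used only for this illustration.
-- Each generated column is 1 when the field differs from 'n#/a', else 0.
flags = foreach a generate
    $0,
    (((chararray)$1 != 'n#/a') ? 1 : 0),
    (((chararray)$2 != 'n#/a') ? 1 : 0),
    (((chararray)$3 != 'n#/a') ? 1 : 0);

dump flags;
```

The explicit (chararray) casts are there only because the file is loaded without a schema.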

If you want to keep only the rows that contain no "n#/a" at all, use AND:

B = FILTER A BY (($1 != 'n#/a') AND ($2 != 'n#/a' ) AND ($3 != 'n#/a' ));

If you want to use OR, combine the equality tests with OR and then negate the combined result:

B = FILTER A BY NOT($1 == 'n#/a' OR $2 == 'n#/a' OR $3 == 'n#/a');

Or, using matches:

B = FILTER A BY NOT($1 matches 'n#/a' OR $2 matches 'n#/a' OR $3 matches 'n#/a');
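One way to check which rows a filter removes is to dump the filter and its complement side by side. This is only a sketch: the relation names clean and dirty are hypothetical, and it assumes the same positional fields and file path as the question.

```pig
-- Same load as in the question.
a = load '/user/home/samp.txt' using PigStorage(',');

-- 'clean' (hypothetical name): rows with no 'n#/a' in fields $1..$3.
clean = filter a by NOT ($1 == 'n#/a' OR $2 == 'n#/a' OR $3 == 'n#/a');

-- 'dirty' (hypothetical name): the complement, rows with at least one 'n#/a'.
dirty = filter a by ($1 == 'n#/a' OR $2 == 'n#/a' OR $3 == 'n#/a');

dump clean;
dump dirty;
```

Every row that has values in $1..$3 should land in exactly one of the two relations, so any row still showing "n#/a" belongs in dirty, not clean.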
