I'm new to Pig, and while experimenting with it I've hit a roadblock.
Suppose I have the following:
dump test;
(1,2014-04-08 12:09:23.0)
(2,2014-04-08 12:09:23.0)
(3,null)
(4,null)
I want to filter "test" to remove the null values, so I do this:
filter_test = filter test by test.column2 is not null;
I expected that to give me something like this:
(1,2014-04-08 12:09:23.0)
(2,2014-04-08 12:09:23.0)
But it returns the same thing. The null rows are not removed.
I'm using Pig 10, and the date column is of type chararray.
Thanks for your help.
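Answer 0 (score: 1)
Your column2 doesn't contain null values; it is a chararray that holds the literal string 'null'. Compare the two examples below: one treats null as a chararray string, the other uses actual null values.
Example 1: null as a chararray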
input.txt
1,2014-04-08 12:09:23.0
2,2014-04-08 12:09:23.0
3,null
4,null
Pig:
A = LOAD 'input.txt' USING PigStorage(',') AS (f1:int,f2:chararray);
B = FILTER A BY f2!='null';
DUMP B;
Output:
(1,2014-04-08 12:09:23.0)
(2,2014-04-08 12:09:23.0)
Example 2: actual null values
input.txt
1,2014-04-08 12:09:23.0
2,2014-04-08 12:09:23.0
3,
4,
Pig:
A = LOAD 'input.txt' USING PigStorage(',') AS (f1:int,f2:chararray);
B = FILTER A BY f2 is not null;
DUMP B;
Output:
(1,2014-04-08 12:09:23.0)
(2,2014-04-08 12:09:23.0)