Hi, does anyone know why the FILTER command returns nothing in the following code? Thank you!
data = LOAD 'sample1.txt'
AS (campaign_id:chararray,
date:chararray,
time:chararray,
keyword:chararray,
display_site:chararray,
placement:chararray,
was_clicked:int,
cpc:int);
count1 = FOREACH (GROUP data ALL) GENERATE COUNT(data);
DUMP count1;
clicked = FILTER data BY (was_clicked==1);
DUMP clicked;
count2 = FOREACH (GROUP clicked ALL) GENERATE COUNT(clicked);
DUMP count2;
I tried DUMP data and saw that some records have was_clicked == 1.
DUMP count1 shows (100), which is expected.
DUMP clicked shows nothing.
DUMP count2 shows nothing.
I run the .pig file in local mode: $ pig -x local analysis1.pig
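Answer 0 (score: 0)
I don't see any problem in the script; it works fine for me. Can you paste a sample of your input? Here is the input and script I tested with.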
input.txt
aaa,1234,5678,bbb,ccc,ddd,2,100
zzz,1234,5678,bbb,ccc,ddd,1,100
xxx,1234,5678,bbb,ccc,ddd,1,100
yyy,1234,5678,bbb,ccc,ddd,2,100
jjj,1234,5678,bbb,ccc,ddd,1,100
kkk,1234,5678,bbb,ccc,ddd,4,100
PigScript:
data = LOAD 'input.txt' using PigStorage(',')
AS (campaign_id:chararray,
date:chararray,
time:chararray,
keyword:chararray,
display_site:chararray,
placement:chararray,
was_clicked:int,
cpc:int);
count1 = FOREACH (GROUP data ALL) GENERATE COUNT(data);
dump count1;
clicked = FILTER data BY (was_clicked==1);
dump clicked;
count2 = FOREACH (GROUP clicked ALL) GENERATE COUNT(clicked);
dump count2;
Output of count1:
(6)
Output of clicked:
(zzz,1234,5678,bbb,ccc,ddd,1,100)
(xxx,1234,5678,bbb,ccc,ddd,1,100)
(jjj,1234,5678,bbb,ccc,ddd,1,100)
Output of count2:
(3)
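One thing worth checking, since the script above differs from the question only in the explicit PigStorage(',') clause: when LOAD has no USING clause, Pig falls back to PigStorage with a tab delimiter. If sample1.txt is comma-delimited (an assumption, the question does not show the file), each whole line would land in campaign_id and the remaining fields, including was_clicked, would be null. COUNT only requires the first field to be non-null, so count1 would still show (100), while was_clicked == 1 would never be true. A minimal diagnostic sketch under that assumption:

```pig
-- Diagnostic sketch; assumes sample1.txt is comma-delimited (not confirmed by the question).
-- With no USING clause, LOAD uses PigStorage with a tab delimiter.
data = LOAD 'sample1.txt'
       AS (campaign_id:chararray,
           date:chararray,
           time:chararray,
           keyword:chararray,
           display_site:chararray,
           placement:chararray,
           was_clicked:int,
           cpc:int);

-- If the delimiter is wrong, was_clicked is null for every record.
unparsed = FILTER data BY was_clicked IS NULL;
null_count = FOREACH (GROUP unparsed ALL) GENERATE COUNT(unparsed);
DUMP null_count;
```

If null_count matches count1, the delimiter is the likely culprit, and adding USING PigStorage(',') to the LOAD statement should make the FILTER return rows.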