I get an error when I run the following:
data1 = load '/user/pig/join2_genchanA.txt' using PigStorage(',')as (showname:chararray, channelname:chararray);
data2 = load '/user/pig/join2_gennumA.txt' using PigStorage(',')as (showname:chararray, showviewer:long);
joindata = join data1 by showname, data2 by showname;
bat = filter joindata by channelname=='BAT';
foreachviewer = FOREACH bat GENERATE channelname, showviewer;
foreachgroupall = GROUP foreachviewer all;
batsum = FOREACH foreachgroupall GENERATE SUM(bat.showviewers);
Now I get the following error:
"2017-09-15 04:01:03,517 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: **Pig script failed to parse**: <line 28, column 46> Invalid scalar projection: bat Details at logfile: /home/cloudera/pig_1504878875671.log"
Please help.
Answer 0: (score: 0)
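The error says that there is no field named bat in the alias foreachgroupall.
Run DESCRIBE on foreachgroupall and it will show you the field names that are available.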
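As a minimal sketch of that check, assuming the same paths and schemas as in the question, the script below runs DESCRIBE on the grouped relation and then sums over the bag it reports. The alias names are the ones from the question. Nothing here was run against the actual data.

```pig
-- Relations as defined in the question (paths and schemas copied from it).
data1 = LOAD '/user/pig/join2_genchanA.txt' USING PigStorage(',') AS (showname:chararray, channelname:chararray);
data2 = LOAD '/user/pig/join2_gennumA.txt' USING PigStorage(',') AS (showname:chararray, showviewer:long);
joindata = JOIN data1 BY showname, data2 BY showname;
bat = FILTER joindata BY channelname == 'BAT';
foreachviewer = FOREACH bat GENERATE channelname, showviewer;
foreachgroupall = GROUP foreachviewer ALL;

-- Prints the schema: a 'group' field plus a bag named after the
-- relation that was grouped (foreachviewer), not after bat.
DESCRIBE foreachgroupall;

-- SUM has to project a field out of that bag, so reference
-- foreachviewer.showviewer rather than bat.showviewers.
batsum = FOREACH foreachgroupall GENERATE SUM(foreachviewer.showviewer);
DUMP batsum;
```

After GROUP ... ALL, the only names visible inside the FOREACH are group and the bag, which is why a reference to bat is parsed as a scalar projection of another relation and rejected.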
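Replace the last line of your snippet so that SUM refers to the grouped bag and its field, foreachviewer.showviewer, instead of bat.showviewers. The corrected script in full: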
-- Load both comma-separated inputs with explicit schemas.
data1 = load '/user/pig/join2_genchanA.txt' using PigStorage(',') as (showname:chararray, channelname:chararray);
data2 = load '/user/pig/join2_gennumA.txt' using PigStorage(',') as (showname:chararray, showviewer:long);
-- Keep only the BAT channel rows before joining.
bat = FILTER data1 BY channelname=='BAT';
joindata = join bat by showname, data2 by showname;
-- Disambiguate the joined fields with relation::field and give them short names.
foreachviewer = FOREACH joindata GENERATE bat::channelname AS channelname, data2::showviewer AS showviewer;
-- Group everything into one bag and sum the viewer counts.
req_stats = FOREACH(GROUP foreachviewer ALL) GENERATE SUM(foreachviewer.showviewer);
DUMP req_stats;
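I also suggest filtering on the channel name BAT right after loading the data sets, rather than after the join, so that fewer rows reach the join. The snippet above does this.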