I am trying to find the number of users whose age is between 19 and 60. Here is a sample query:
loadtable = load '/user/userdetails.txt' using PigStorage(',') AS (name:chararray,age:int);
filteredvalues = filter loadtable by (age > 19 AND age < 60);
grouped = GROUP filteredvalues ALL;
count = foreach grouped generate COUNT(grouped);
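I get the following error: "Invalid scalar projection: grouped : A column needs to be projected from a relation for it to be used as a scalar"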
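Answer 0 (score: 2)
You have to count filteredvalues, not grouped. After GROUP ... ALL, grouped is a relation whose single tuple contains a bag named filteredvalues, and COUNT expects that bag as its argument.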
total = foreach grouped generate COUNT(filteredvalues);
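To see why the bag name matters, it may help to inspect the schema of grouped before counting. This is a minimal sketch that assumes the aliases from the question are already defined in the same session:

```pig
-- Inspect the relation produced by GROUP ... ALL (aliases taken from the question).
DESCRIBE grouped;
-- The schema should show two fields: `group` (the constant key 'all') and a bag
-- named `filteredvalues` that holds every tuple that passed the filter.

-- COUNT takes a bag, so pass the bag field rather than the relation alias.
total = FOREACH grouped GENERATE COUNT(filteredvalues);
DUMP total;
```

Passing grouped itself makes Pig treat the alias as a scalar, which is what triggers the "Invalid scalar projection" error.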
Answer 1 (score: 1)
Sample userdetails.txt:
Robin,85
BOB,55
Maya,23
Sara,45
David,23
Maggy,22
Robert,75
Syam,23
Mary,25
Saran,17
Stacy,19
Kelly,22
Code:
grunt> loadtable = load '/user/userdetails.txt' using PigStorage(',') AS (name:chararray,age:int);
grunt> filteredvalues = filter loadtable by (age > 19 AND age < 60);
grunt> grouped = GROUP filteredvalues ALL;
grunt> count = foreach grouped generate COUNT(filteredvalues);
grunt> dump count;
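Always apply COUNT to the relation (bag) that was grouped, i.e. the alias named in the GROUP statement, not to the alias that holds the grouped result. Otherwise Pig throws: "Invalid scalar projection: grouped : A column needs to be projected from a relation for it to be used as a scalar"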
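Note that the filter uses strict comparisons, so users aged exactly 19 or 60 are excluded. If "between 19 and 60" is meant to include the endpoints, the following sketch shows one way to write it. The aliases inclusive_values, inclusive_grouped and inclusive_count are hypothetical names introduced here for illustration:

```pig
-- Same load statement as in the question.
loadtable = LOAD '/user/userdetails.txt' USING PigStorage(',') AS (name:chararray, age:int);

-- Assumption: the endpoints 19 and 60 should be counted, hence >= and <=.
inclusive_values = FILTER loadtable BY (age >= 19 AND age <= 60);

-- Collect all remaining tuples into a single bag, then count that bag.
inclusive_grouped = GROUP inclusive_values ALL;
inclusive_count = FOREACH inclusive_grouped GENERATE COUNT(inclusive_values);
DUMP inclusive_count;
```

With the sample file above, the strict filter matches 8 users, and the inclusive version should also pick up Stacy (age 19), giving 9.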