Pig Latin: how to get SUM() output like this?

Time: 2017-06-24 20:26:54

Tags: hadoop mapreduce apache-pig

I have some data like (name, score):

A 10
B 25
C 15
A 5
A 36
B 98
C 78
C 78
B 12

data = LOAD 'demo.txt'  using PigStorage (',') as (name : chararray , score : int);
groupScore = GROUP data by score;
totalscore = FOREACH groupScore Generate data.name , SUM(data.score);

When I use the SUM() function, the output looks like this:

{(A)(A)(A), (51)} 
{(B)(B)(B), (135)}

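I would like to know whether there is any way to get output like

{(A), (51)},

without the name field being repeated for every occurrence?
Any guidance would be helpful.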

2 answers:

Answer 0: (score: 3)

Here is a query that solves this: group by name and generate the group key instead of data.name.

data = LOAD 'demo.txt' USING PigStorage(',') AS (name:chararray, score:int);
groupScore = GROUP data BY name;
result = FOREACH groupScore GENERATE group, SUM(data.score);

Output:

(A,51)
(B,135)
(C,171)
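
To see why `group` holds the name once while `data.name` repeats it, it can help to inspect the grouped relation before aggregating. The following is a minimal sketch, assuming the same comma-separated demo.txt as in the question; the output field names `name` and `total` are added here for illustration and are not part of the original answer.

```pig
data = LOAD 'demo.txt' USING PigStorage(',') AS (name:chararray, score:int);
groupScore = GROUP data BY name;

-- DESCRIBE prints the schema of the grouped relation: a `group` field
-- holding the key (the name) and a bag called `data` that contains every
-- original (name, score) tuple sharing that key.
DESCRIBE groupScore;

-- `group` is a scalar key, so it appears once per output row, whereas
-- data.name is a bag with one entry per input row.
result = FOREACH groupScore GENERATE group AS name, SUM(data.score) AS total;
DUMP result;
```

Projecting `group` rather than `data.name` is what removes the repetition; the SUM() call itself is unchanged.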

Answer 1: (score: 0)

Group by name:

data = LOAD 'demo.txt' USING PigStorage(',') AS (name:chararray, score:int);
groupScore = GROUP data by name;
totalscore = FOREACH groupScore Generate data.name , SUM(data.score);
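
The query above still projects data.name, which is a bag with one entry per input row, so each name is repeated. If the name should stay inside a bag, as in the `{(A), (51)}` form asked for in the question, one option is a nested FOREACH with DISTINCT, which collapses the repeated names before they are emitted. This is a sketch under the assumption that demo.txt is comma-separated as in the question; the aliases `names` and `total` are introduced here for illustration.

```pig
data = LOAD 'demo.txt' USING PigStorage(',') AS (name:chararray, score:int);
groupScore = GROUP data BY name;

totalscore = FOREACH groupScore {
    -- Remove duplicate names inside this group's bag; since the group key
    -- is the name, this should leave a one-element bag per row.
    names = DISTINCT data.name;
    GENERATE names, SUM(data.score) AS total;
};
DUMP totalscore;
```

When a plain value is enough, generating `group` directly is simpler and avoids the extra DISTINCT step.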