I'm new to Pig Latin and I'm trying to reproduce a simple SQL query. A sample input table looks like this:
**A B C**
1 3 $5
2 4 $6
2 5 $7
I want to count the rows in column B and sum the values in column C for each value of A, like this:
**A Count(B) Sum(C)**
1 1 $5
2 2 $13
Or in SQL:
Select A, count(B), Sum(C)
From Data
Group by A
How can I accomplish this in Pig?
Answer 0 (score: 1)
Pig script:
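-- Load comma-separated rows into three numeric columns.
input_data = LOAD 'input.csv' USING PigStorage(',') AS (A:long, B:long, C:long);
-- Collect all rows that share the same value of A.
input_data_grp_by_A = GROUP input_data BY A;
-- For each group, count the B values and sum the C values.
required_stats = FOREACH input_data_grp_by_A GENERATE group AS A, COUNT(input_data.B) AS COUNT_B, SUM(input_data.C) AS SUM_C;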
Input:
1,3,5
2,4,6
2,5,7
Output (required_stats):
(1,1,5)
(2,2,13)
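The script defines required_stats but does not include a statement that prints or saves it. A minimal sketch of one way to see the tuples shown above, assuming the three statements from the script have already run in the same script or Grunt session. The output path 'output' is a hypothetical name chosen for illustration.

```pig
-- Print the relation to the console; this triggers execution of the pipeline.
DUMP required_stats;

-- Alternatively, write the result as comma-separated text.
-- 'output' is a hypothetical directory name; it must not already exist.
STORE required_stats INTO 'output' USING PigStorage(',');
```

DUMP is convenient for small results like this one, while STORE is the usual choice when the result is large or needs to be read by another job.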