Pig Latin query: sum and count

Time: 2015-08-05 03:32:28

Tags: apache-pig

I'm new to Pig Latin and I'm trying to reproduce a simple SQL query. The sample input table has the following format:

**A   B  C**
  1   3  $5
  2   4  $6
  2   5  $7

For each value of A, I want to count the rows in column B and sum the values in column C, like this:

**A   Count(B)   Sum(C)**
  1   1          $5
  2   2          $13

Or, in SQL:

Select A, count(B), Sum(C)
From Data
Group by A

How can I accomplish this in Pig?

1 Answer:

Answer 0: (score: 1)

Pig script:

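-- Load the comma-separated file with a typed schema.
input_data = LOAD 'input.csv' USING PigStorage(',') AS (A:long, B:long, C:long);
-- Collect all rows that share the same value of A into one bag per group.
input_data_grp_by_A = GROUP input_data BY A;
-- For each group, emit the key, the number of non-null B values, and the sum of C.
required_stats = FOREACH input_data_grp_by_A GENERATE group AS A, COUNT(input_data.B) AS COUNT_B, SUM(input_data.C) AS SUM_C;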

Input:

1,3,5
2,4,6
2,5,7

Output: required_stats

(1,1,5)
(2,2,13)
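
The script above assumes column C already holds plain integers, as in the answer's input. If the file instead keeps the `$` prefix shown in the question's table, one possible approach is to load C as a chararray, strip the symbol with Pig's built-in REPLACE, and cast the result to long before grouping. This is only a sketch: the file name `input_with_dollars.csv` and the relation names `raw_data`, `cleaned_data`, and `cleaned_grp_by_A` are made up for illustration, and it assumes every C value is a `$` followed by a whole number.

```pig
-- Hypothetical input file whose third column looks like $5, $6, $7.
raw_data = LOAD 'input_with_dollars.csv' USING PigStorage(',') AS (A:long, B:long, C:chararray);
-- REPLACE takes a regular expression, so the dollar sign is escaped with a double backslash.
cleaned_data = FOREACH raw_data GENERATE A, B, (long)REPLACE(C, '\\$', '') AS C;
-- Same grouping and aggregation as in the answer's script.
cleaned_grp_by_A = GROUP cleaned_data BY A;
required_stats = FOREACH cleaned_grp_by_A GENERATE group AS A, COUNT(cleaned_data.B) AS COUNT_B, SUM(cleaned_data.C) AS SUM_C;
DUMP required_stats;
```

Any value that cannot be cast to long after the `$` is stripped should come out as null, and SUM skips nulls, so it is worth checking the cleaned relation for unexpected nulls before trusting the totals.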