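Count and sum across multiple fields using Pig

Time: 2016-02-23 12:04:25

Tags: apache-pig

I have four fields of type int, and the dataset also contains nulls. I need to count the fields in each row that hold data: for example, if the first and third columns are null and the second and fourth columns hold integer values, the output is 2. Second, I also need the sum of those fields, as in the sample output below.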

Input:

null null null null
1    3    5    null 
null null 8    5 

Output:

0    null 
3    9
2    13

2 Answers:

Answer 0: (score: 0)

Here is an example of how to do it.

I hope I understood your question correctly.

I just did the same thing myself :)

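-- Declare the four fields as int, as described in the question
A = LOAD 'data.csv' USING PigStorage(',') AS (f1:int, f2:int, f3:int, f4:int);

-- Flag each field: 1 if it holds a value, 0 if it is null
B = FOREACH A GENERATE
    ( f1 IS NULL ? 0 : 1 ) AS f1_validity,
    ( f2 IS NULL ? 0 : 1 ) AS f2_validity,
    ( f3 IS NULL ? 0 : 1 ) AS f3_validity,
    ( f4 IS NULL ? 0 : 1 ) AS f4_validity;

-- Add the flags to get the number of non-null fields in each row
C = FOREACH B GENERATE
    f1_validity + f2_validity + f3_validity + f4_validity AS valid_field_cnt;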
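
The snippet above only produces the count. For the sum, keep in mind that Pig arithmetic returns null when any operand is null, so a plain f1 + f2 + f3 + f4 would be null for every row that has a missing field. A minimal sketch of one way around that, assuming the same 'data.csv' layout; the alias field_sum_or_zero is made up for illustration:

```pig
-- Sketch only: assumes the comma-delimited 'data.csv' with four int columns.
A = LOAD 'data.csv' USING PigStorage(',') AS (f1:int, f2:int, f3:int, f4:int);

-- Replace each null with 0 before adding, so one null does not
-- turn the whole sum into null.
S = FOREACH A GENERATE
    ( f1 IS NULL ? 0 : f1 ) + ( f2 IS NULL ? 0 : f2 )
  + ( f3 IS NULL ? 0 : f3 ) + ( f4 IS NULL ? 0 : f4 ) AS field_sum_or_zero;

DUMP S;
```

Note that a row with all four fields null comes out as 0 here, not null as in the question's expected output; that case needs an extra check on the count.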

Answer 1: (score: 0)

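-- Declare the four fields as int, as described in the question
A = LOAD 'test8.txt' USING PigStorage('\t') AS (a:int, b:int, c:int, d:int);
-- a1..d1 flag non-null fields; a..d replace nulls with 0 so they can be added
B = FOREACH A GENERATE ( a is null ? 0 : 1 ) AS a1,
                       ( b is null ? 0 : 1 ) AS b1,
                       ( c is null ? 0 : 1 ) AS c1,
                       ( d is null ? 0 : 1 ) AS d1,
                       ( a is null ? 0 : a ) AS a,
                       ( b is null ? 0 : b ) AS b,
                       ( c is null ? 0 : c ) AS c,
                       ( d is null ? 0 : d ) AS d;
-- Emit a null sum only when no field has data; testing the sum against 0
-- would also blank out rows whose values genuinely add up to 0
C = FOREACH B GENERATE a1 + b1 + c1 + d1 as field_count,
                       ((a1 + b1 + c1 + d1) == 0 ? null : (a + b + c + d)) as field_sum;
DUMP C;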

Output:

[Image: Field Count and Field Sum]
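
As a possible alternative to the per-field conditionals, the four values can be collected into a bag and passed to the built-in COUNT and SUM functions, both of which skip nulls. This is a rough sketch, assuming the same tab-delimited 'test8.txt' with four int columns; the alias vals is illustrative, and the behavior is worth confirming on your Pig version:

```pig
-- Sketch only: assumes tab-delimited 'test8.txt' with four int columns.
A = LOAD 'test8.txt' USING PigStorage('\t') AS (a:int, b:int, c:int, d:int);

-- Put the four values of each row into a bag, one value per tuple.
B = FOREACH A GENERATE TOBAG(a, b, c, d) AS vals;

-- COUNT ignores tuples whose first field is null; SUM ignores null values.
C = FOREACH B GENERATE COUNT(vals) AS field_count, SUM(vals) AS field_sum;

DUMP C;
```

This keeps the logic in one place if more columns are added later: only the TOBAG argument list changes. For a row where every field is null, SUM has nothing to add, so the sum should come out as null rather than 0, which is what the question asks for.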