PIG: How to print only certain attributes from a grouped bag

Time: 2016-12-13 11:34:44

Tags: apache-pig

I have a grouped result with the following schema:

| grouped     | group:chararray    | log:bag{:tuple(driverId:chararray,truckId:chararray,eventTime:chararray,eventType:chararray,longitude:chararray,latitude:chararray,eventKey:chararray,CorrelationId:chararray,driverName:chararray,routeId:chararray,routeName:chararray,eventDate:chararray)}

When I run the following:

x = FOREACH grouped GENERATE {log.driverId, log.truckId, log.driverName};
illustrate x;

the output is:

| x     | :bag{:tuple(:bag{:tuple(driverId:chararray)})}                           |
------------------------------------------------------------------------------------
|       | {({(11), (11)}), ({(74), (39)}), ({(Jamie Engesser), (Jamie Engesser)})} |
------------------------------------------------------------------------------------

whereas I expected:

{({(11, 74, Jamie Engesser), (11, 39, Jamie Engesser)})}

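1 Answer:

Answer 0 (score: 0)

Found the solution.

Each record in grouped pairs the group key with a bag (log), and projecting the bag's fields one at a time (log.driverId, log.truckId, ...) produces a separate bag per field. To keep the fields together in each tuple, I had to use a nested FOREACH, as shown below: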

x = FOREACH grouped {
    -- keep the group key
    val1 = group;
    -- project only the required fields from each tuple in the bag
    vals = FOREACH log GENERATE driverId, truckId, driverName;
    GENERATE val1, vals;
};

This selects only the required attributes from each tuple in the grouped bag.
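
A possibly simpler alternative to the nested FOREACH, offered as a sketch rather than a definitive answer: Pig lets you project several fields from a bag in a single expression with the bag.(field1, field2) form, which should keep the fields together in each tuple. The input path, the delimiter, and the choice of driverId as the grouping key below are assumptions made for illustration; only the field names and the aliases log and grouped come from the question.

```pig
-- Hypothetical input: the file name and the ',' delimiter are assumptions.
log = LOAD 'truck_events.csv' USING PigStorage(',') AS (
    driverId:chararray, truckId:chararray, eventTime:chararray,
    eventType:chararray, longitude:chararray, latitude:chararray,
    eventKey:chararray, CorrelationId:chararray, driverName:chararray,
    routeId:chararray, routeName:chararray, eventDate:chararray);

-- Assumption: the question does not say which key was used for grouping.
grouped = GROUP log BY driverId;

-- Project three fields of the bag at once, so each tuple in the
-- resulting bag carries driverId, truckId and driverName together.
x = FOREACH grouped GENERATE group, log.(driverId, truckId, driverName);

-- Inspect the resulting schema.
DESCRIBE x;
```

If the group key is not needed in the output, dropping group from the GENERATE clause should leave only the projected bag.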

If anyone knows a better, more efficient, or simpler way, please comment.

Thanks