如何定义一个常量数组并检查一个值是否在Pig Latin

时间:2015-08-31 20:53:10

标签: apache-pig

我想在Pig中定义一个用户ID数组,然后如果输入中的userId不在该数组中,则过滤数据, 我怎么做猪拉丁?以下是我打算做的例子

由于

inputData = load '$INPUT' USING PigStorage('|') AS (useriD:chararray,controllerAction:chararray,url:chararray,browserName:chararray,IsMobile:chararray,exceptionDetails:chararray,renderTime:int,serviceHostId:int,auditEventTime:chararray);
filteredInput = filter inputData by controllerAction is not null and auditEventTime is not null and serviceHostId is not null and renderTime is not null and useriD in ('2be2df06-f4ba-4d87-8938-09d867d3f2fe','ac1ac6bf-d151-49fc-8c7c-2b52d2efbb58','f00aec16-36e5-46ae-b7cb-a0f1eeefe609','258890f9-102a-4f8e-a001-ae24d2e25269','cf221779-a077-472c-b377-cca4a9230e1b');

感谢Murali ..我尝试了声明变量然后使用Flatten和stringSplit加入的方法。但是我得到以下错误  语法错误,flatteneduserids等附近的意外符号'

%declare REQUIRED_USER_IDS 'xxxxx,yyyyy,sssss' ;

inputData = load '$INPUT' USING PigStorage('|') AS (useriD:chararray,controllerAction:chararray,url:chararray,browserName:chararray,IsMobile:chararray,exceptionDetails:chararray,renderTime:int,serviceHostId:int,auditEventTime:chararray);

filteredInput = filter inputData by controllerAction is not null and auditEventTime is not null and serviceHostId is not null and renderTime is not null;

flatteneduserids = FLATTEN(STRSPLIT('$REQUIRED_USER_IDS',',')) AS (uid:chararray);

useridfilter = JOIN filteredInput BY useriD, flatteneduserids BY  uid USING 'replicated';

所以现在我尝试了另一种声明flatteneduserids的方式,导致错误未定义的别名:IN

flatteneduserids = FOREACH IN GENERATE FLATTEN(STRSPLIT('$REQUIREDUSERIDS',',')) AS (uid:chararray);

2 个答案:

答案 0 :(得分:0)

有一个类似的用例。通过在%define中声明常量值并访问IN子句中的相同内容来尝试该方法,无法实现目标。 (参见:Declare a comma seperated string constant

值得考虑的想法......

如果IN子句中的条件是静态/引用/元类数据,那么建议在静态文件中声明它。然后,我们可以在运行时读取数据,并使用输入数据进行内部联接以检索匹配的记录。

function hidePace() {
  $height = $(window).height();
  var a = function() {
    $('.pace-progress,.velo2').delay(500).animate({
      opacity: 0},{
      queue: false,
      duration:0
    });
    return $('.logoL,.logoLback').animate({
      top: '-' + $height,
      opacity:0 },{
      queue: false,
      duration: 3000
    }).promise("fx")
};

var b = function() {
  return $('.velo').delay(700).animate({
    opacity: 0
  }, 1000).promise("fx")
};


var c = function() {
  return $('.menuOff').delay(3000).animate({
    width: $height - 260
  }, {
    queue: false,
    done: function() {
      $('.logoL,.logoLback,.velo,.velo2').remove();
    }
  }).promise("fx")
};

return $.when(a(), b(), c())
}

function showFullMenu() {
  setTimeout(function() {
    if (mobileView === 0) {
      $('.textc').animate({
        width: 148},{
        queue: false,
        duration:1200, 
        easing:"easeInQuart"
      });
    }
  }, 1500);

  setTimeout(function() {
    if (mobileView === 0) {
      $('.textc,.menu').css('border', 'none').animate({
        height: $height - 210},{
        queue: false,
        duration: 1000,
        easing:"easeInQuart", 
        done: function() {
        $('.sub1').css('top', $('.multilevel[data-id=1]').offset().top);
        $('.sub2').css('top', $('.multilevel[data-id=2]').offset().top);
        $('.sub3').css('top', $('.multilevel[data-id=3]').offset().top);
      }});
    }

    $('.menuOff').animate({
      'width': $height - 260},{
      queue: false
    });
  }, 3000);
}


function startIndex() {
  hidePace().done(showFullMenu);
}

答案 1 :(得分:0)

我无法弄清楚如何在记忆中做到这一点 因此,根据Murali的建议,我在文件中添加了用户ID ..加载文件,然后进行加入......按预期为mr工作