Can we access a relation loaded in script A from script B in Apache Pig?

Time: 2017-07-28 19:27:47

Tags: hadoop apache-pig

My question is whether there is a way to access the relation 'data1', built in script1.pig, from script2.pig without having to load the data again.

script1.pig contains:

RUN script2.pig;
EXEC;

script2.pig contains:

data1 = LOAD '$some_location' USING PigStorage('\t') AS (...);

Can I access data1 in script2.pig without reloading it there?

1 Answer:

Answer 0: (score: 0)

I tried this in my project and it works:

runner_script.pig contains:

RUN script1.pig; 
EXEC;

RUN script2.pig; 
EXEC;

script1.pig contains:

data1 = LOAD '$some_location' USING PigStorage('\t') AS (...);
filter1 = FILTER data1 BY <<some-condition-1>>;

script2.pig contains:

filter1 = FILTER data1 BY <<some-condition-2>>;

This way I don't have to load data1 twice.
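The reason this works, as far as I understand it, is that `run` executes a script in the current context, so aliases defined inside it stay visible afterwards. `exec <script>` runs the script in a separate context, and its aliases are not available once it finishes. Here is a minimal sketch of the same idea with the second step inlined. The input path and the filter condition are hypothetical placeholders, not taken from my project:

```pig
-- runner_script.pig (sketch; the path and the condition below are hypothetical)

-- run keeps the aliases from script1.pig (including data1) in scope.
-- -param supplies the value substituted for $some_location.
run -param some_location=/tmp/input.tsv script1.pig;

-- data1 is referenced here without a second LOAD.
-- A positional reference ($0) is used because the real schema is elided above.
filter2 = FILTER data1 BY $0 IS NOT NULL;
DUMP filter2;
```

If the first line were `exec -param some_location=/tmp/input.tsv script1.pig;` instead, the FILTER statement would fail because data1 would be undefined in the runner script.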