Pig - How to join and define a schema in one step

Time: 2014-04-15 09:53:44

Tags: hadoop apache-pig bigdata cloudera

I currently resort to the following:

A = LOAD 'a.txt' USING PigStorage('\\u001') AS (
    foo:int
    ,bar:chararray
);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (
    foo:int
    ,baz:long
);
C = JOIN A BY foo, B BY foo;
D = FOREACH C GENERATE
    A::foo AS foo
    ,A::bar AS bar
    ,B::baz AS baz
;

How can I join and define the schema in a single step?

1 answer:

Answer 0: (score: 3)

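According to the documentation, you cannot define a schema while joining relations.

Note: syntactically, you can nest the commands, which gives the impression of saving a few steps, like this: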

D = foreach
    (join (LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int ,bar:chararray)) by foo,
          (LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int ,baz:long)) by foo
    ) generate $0 as foo, $1 as bar, $3 as baz;

But I would avoid doing this. It is messy, and it produces the same explain plan as the original script, so nothing is gained.
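One way to check the claim about the plans yourself is to run the step-by-step version in the Grunt shell and inspect it with DESCRIBE and EXPLAIN, then repeat with the nested statement and compare. The following is only a sketch: it reuses the file names and delimiter from the question, and the comments about field positions assume the default join output order (all fields of A followed by all fields of B).

```pig
-- Step-by-step version, same as in the question but with compact schemas.
A = LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int, bar:chararray);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int, baz:long);
C = JOIN A BY foo, B BY foo;

-- After the join the fields are A::foo ($0), A::bar ($1), B::foo ($2), B::baz ($3),
-- which is why the nested version projects $0, $1 and $3.
D = FOREACH C GENERATE A::foo AS foo, A::bar AS bar, B::baz AS baz;

DESCRIBE C;  -- shows the joined schema with the A:: and B:: prefixes
DESCRIBE D;  -- shows the renamed schema
EXPLAIN D;   -- prints the plans; compare with EXPLAIN on the nested statement
```

Keeping the named aliases also lets you refer to fields as A::foo and B::baz instead of by position, which is less fragile if a column is later added to either input.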