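I am running into a problem when merging two pipes. At a high level, the problem boils down exactly to the issue described here.
I have two pipes, Output1 and Output2. From each of them I retain two fields, rename those fields to a common schema, and merge the results.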
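To make the intended result concrete, here is a minimal plain-Java sketch that does not use Cascading at all. The sample rows, the extra field, and the names MergeSketch, row, retainAndRename, and merge are made up for illustration; only the field names come from my assembly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MergeSketch {

    // Build one row from alternating name/value pairs (hypothetical helper).
    static Map<String, String> row(String... nameValuePairs) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            row.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return row;
    }

    // Keep only the two named source fields and expose them as fieldA / fieldB.
    // This is the plain-Java analogue of Retain followed by Rename.
    static List<Map<String, String>> retainAndRename(
            List<Map<String, String>> rows, String sourceA, String sourceB) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> row : rows) {
            Map<String, String> renamed = new LinkedHashMap<>();
            renamed.put("fieldA", row.get(sourceA));
            renamed.put("fieldB", row.get(sourceB));
            result.add(renamed);
        }
        return result;
    }

    // Concatenate both streams; no grouping or deduplication is applied.
    static List<Map<String, String>> merge(
            List<Map<String, String>> left, List<Map<String, String>> right) {
        List<Map<String, String>> merged = new ArrayList<>(left);
        merged.addAll(right);
        return merged;
    }

    public static void main(String[] args) {
        // Made-up sample data; "extra" stands for any field that gets dropped.
        List<Map<String, String>> output1 = Arrays.asList(
                row("output1fieldA", "a1", "output1fieldB", "b1", "extra", "x"));
        List<Map<String, String>> output2 = Arrays.asList(
                row("output2fieldA", "a2", "output2fieldB", "b2", "extra", "y"));

        List<Map<String, String>> finalOutput = merge(
                retainAndRename(output1, "output1fieldA", "output1fieldB"),
                retainAndRename(output2, "output2fieldA", "output2fieldB"));

        // Every merged row carries exactly the two fields of the common schema.
        System.out.println(finalOutput);
    }
}
```

The Cascading assembly that is supposed to do the same thing is: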
// Keep only the two fields needed from each upstream pipe.
Pipe Output1Retain = new Pipe("Output1Retain", new Retain(Output1, new Fields("output1fieldA", "output1fieldB")));
Pipe Output2Retain = new Pipe("Output2Retain", new Retain(Output2, new Fields("output2fieldA", "output2fieldB")));
// Rename both branches to the same schema so they can be merged.
Fields finalFields = new Fields("fieldA", "fieldB");
Pipe Output1Rename = new Pipe("Output1Rename", new Rename(Output1Retain, Fields.ALL, finalFields));
Pipe Output2Rename = new Pipe("Output2Rename", new Rename(Output2Retain, Fields.ALL, finalFields));
// Merge the two renamed branches into a single stream.
Pipe FinalOutput = new Merge("MergePipe", Output1Rename, Output2Rename);
The code above fails with the following error from the flow planner, before the Cascading job even starts:
union of steps have 2 fewer elements than parent assembly MapReduceHadoopRuleRegistry, missing: [Each(Pass fileA through*Pass fileB through)[FilterNotNull[decl:ALL]], Each(Pass fileA through*Pass fileB through)[Identity[decl:ARGS]]]
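I see that this issue is fixed in Cascading 3.1.0, but I am on 3.0.2 and unfortunately cannot move to 3.1.0. I tried putting a Checkpoint before the inputs to the Merge, but it did not help. I have described the preparation of Output1 and Output2 in detail.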
Is there any way to work around this problem?