Spring Batch with Hadoop - configuring multiple mappers (MultipleInputs)

Time: 2016-01-25 21:23:07

Tags: spring hadoop mapreduce spring-batch

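I need to process two files that have different formats. In plain Hadoop I would register a separate mapper for each input path, like this: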

MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, MapperOne.class);
MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, MapperTwo.class);

How can I define multiple mappers in Spring Hadoop? My current job definition takes only a single mapper:

<job id="wordcount-job"
  input-path="${wordcount.input.path:/user/input/word/}"
  output-path="${wordcount.output.path:/user/output/word/}" 
  mapper="org.apache.hadoop.examples.WordCount.TokenizerMapper1"
  reducer="org.apache.hadoop.examples.WordCount.IntSumReducer" />

Any suggestions would be appreciated.

0 answers:

No answers yet