Hadoop Spring: multiple inputs and mappers in a job

Asked: 2014-11-11 13:15:57

Tags: java spring hadoop

How can I specify multiple input files, each with its own input format, in the job tag?

<?xml version="1.0" encoding="UTF-8"?>
<beans:beans xmlns="http://www.springframework.org/schema/hadoop"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:beans="http://www.springframework.org/schema/beans"
     xmlns:context="http://www.springframework.org/schema/context"
     xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
       http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
       http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

   <context:property-placeholder location="hadoop.properties"/>

   <configuration>
      fs.default.name=${hd.fs}
      yarn.resourcemanager.address=${hd.rm}
      mapreduce.framework.name=${mr.fw}
   </configuration>

   <job id="wordcountJob"
      input-path="${wordcount.input.path}" 
      output-path="${wordcount.output.path}"
      mapper="org.apache.hadoop.examples.WordCount.TokenizerMapper"
      reducer="org.apache.hadoop.examples.WordCount.IntSumReducer"/>
</beans:beans>

I want the equivalent of what a plain Java program can do:

MultipleInputs.addInputPath(job, firstPath, FirstInputFormat.class, FirstMap.class);
MultipleInputs.addInputPath(job, secondPath, SecondInputFormat.class, SecondMap.class);
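
For context, here is a minimal sketch of what the two calls above appear to record in the job configuration, assuming the Hadoop 2.x implementation of `org.apache.hadoop.mapreduce.lib.input.MultipleInputs`: each call appends a `path;class` pair to two comma-separated properties. The sketch uses only the JDK. The class `MultipleInputsConfigSketch`, the `/data/...` paths, and the `com.example` class names are hypothetical and stand in for the real ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/**
 * Sketch of the two configuration values that MultipleInputs.addInputPath
 * accumulates. No Hadoop dependency; class names are passed as plain strings.
 */
class MultipleInputsConfigSketch {

    // Property names as used by MultipleInputs
    // (assumption: Hadoop 2.x, new "mapreduce" API).
    static final String DIR_FORMATS = "mapreduce.input.multipleinputs.dir.formats";
    static final String DIR_MAPPERS = "mapreduce.input.multipleinputs.dir.mappers";

    /** Joins path -> class-name pairs as "path;class,path;class". */
    static String join(Map<String, String> pathToClass) {
        StringJoiner joined = new StringJoiner(",");
        for (Map.Entry<String, String> entry : pathToClass.entrySet()) {
            joined.add(entry.getKey() + ";" + entry.getValue());
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Hypothetical paths and class names, for illustration only.
        Map<String, String> formats = new LinkedHashMap<>();
        formats.put("/data/first", "com.example.FirstInputFormat");
        formats.put("/data/second", "com.example.SecondInputFormat");

        Map<String, String> mappers = new LinkedHashMap<>();
        mappers.put("/data/first", "com.example.FirstMap");
        mappers.put("/data/second", "com.example.SecondMap");

        // Print the key=value lines in the shape the job configuration would hold.
        System.out.println(DIR_FORMATS + "=" + join(formats));
        System.out.println(DIR_MAPPERS + "=" + join(mappers));
    }
}
```

In Hadoop, `MultipleInputs` also switches the job's input format to `DelegatingInputFormat` and its mapper to `DelegatingMapper`, so these two properties are only part of the picture. Whether supplying the same settings through the Spring `<job>` element behaves identically is an untested assumption.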

I also checked the XSD file and did not find any attribute for this. So how can we specify multiple inputs in a job?

0 answers:

No answers yet