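I want to run a Hadoop application from Java.
If I run my application on the cluster with the command hadoop jar, everything works fine. But I need to run the job remotely.
I have set the ResourceManager address and other properties in the configuration like this: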
jobConf.set("yarn.resourcemanager.address", "192.168.111.9:8032");
jobConf.set("mapreduce.framework.name", "yarn");
jobConf.set("fs.default.name", "hdfs://192.168.111.9:8020");
// If not set, an error is thrown about being unable to write to /tmp/hadoop-yarn
jobConf.set("yarn.app.mapreduce.am.staging-dir", "/user");
jobConf.set("mapreduce.app-submission.cross-platform", "true");
jobConf.set("mapreduce.application.classpath", "$HADOOP_MAPRED_HOME/*:$HADOOP_MAPRED_HOME/lib/*:$MR2_CLASSPATH:$HADOOP_CLIENT_CONF_DIR:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_HDFS_HOME/*:$HADOOP_HDFS_HOME/lib/*:$HADOOP_YARN_HOME/*:$HADOOP_YARN_HOME/lib/*");
String target = "variables-hadoop-0.0.1-SNAPSHOT.jar";
jobConf.set("mapreduce.job.jar", target);
But every time I run the application, it never reaches the ResourceManager, and the log shows:
2017-01-25 19:36:09,998 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2017-01-25 19:36:11,032 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
It keeps retrying for a long time.
Then I tried setting this property:
jobConf.set("yarn.resourcemanager.scheduler.address", "192.168.111.9:8030 ");
But that throws a different error:
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 192.168.111.9:8030 (configuration property 'yarn.resourcemanager.scheduler.address')
Is there a "simple way" to do this? It is hard to discover every property that needs to be set.
I am using a running Cloudera cluster with Hadoop 2.7.

Answer 0 (score: 0)
You have added a space at the end of the scheduler address, which is why you are getting the IllegalArgumentException.
Change:
jobConf.set("yarn.resourcemanager.scheduler.address", "192.168.111.9:8030 ");
// ^
to
jobConf.set("yarn.resourcemanager.scheduler.address", "192.168.111.9:8030");
// ^
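
A stray space like this is easy to miss, so it can help to clean and check address values before handing them to the configuration. The following is only a minimal sketch of that idea: the class name HostPort and the method normalize are hypothetical helpers, not part of Hadoop, and the check relies on java.net.URI rather than on whatever parsing Hadoop itself performs.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper (not a Hadoop class): trims and validates a host:port string.
final class HostPort {
    private HostPort() {}

    /** Trims the value, checks that it parses as host:port, and returns the cleaned value. */
    static String normalize(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("address is null");
        }
        String trimmed = raw.trim();
        try {
            // A dummy scheme lets java.net.URI parse the value as an authority (host:port).
            URI uri = new URI("dummy://" + trimmed);
            if (uri.getHost() == null || uri.getPort() < 0) {
                throw new IllegalArgumentException("Not a valid host:port: '" + raw + "'");
            }
            return uri.getHost() + ":" + uri.getPort();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid host:port: '" + raw + "'", e);
        }
    }

    public static void main(String[] args) {
        // The trailing space from the question is removed before validation.
        System.out.println(normalize("192.168.111.9:8030 ")); // prints 192.168.111.9:8030
    }
}
```

With a helper along these lines, the call from the question could be written as jobConf.set("yarn.resourcemanager.scheduler.address", HostPort.normalize("192.168.111.9:8030 ")), so a malformed value fails at the point where it is set instead of later during job submission.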