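I am writing a Java client program that talks to a remote Hadoop cluster and prints all running jobs.
My local machine can ping the remote machine that runs Hadoop. I tried the code below, but I am stuck on the parameters. Where can I get the values for these configuration parameters?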
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.JobStatus;

Configuration conf = new Configuration();
// this should be like defined in your mapred-site.xml
conf.set("mapred.job.tracker", "Hadoopmaster:54311");
// like defined in hdfs-site.xml
conf.set("fs.default.name", "hdfs://namenode.com:9000");
System.out.println("got configuration : " + conf);
InetSocketAddress jobtracker = new InetSocketAddress("jobtracker.mapredhost.myhost", 8021);
JobClient jobClient = new JobClient(jobtracker, conf);
// jobsToComplete() returns the jobs that have not finished yet
JobStatus[] jobs = jobClient.jobsToComplete();
for (int i = 0; i < jobs.length; i++) {
    JobStatus js = jobs[i];
    // print only the jobs that are currently running
    if (js.getRunState() == JobStatus.RUNNING) {
        JobID jobId = js.getJobID();
        System.out.println(jobId);
    }
}
mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>Hadoopmaster:54311</value>
<description>test </description>
</property>
core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://Hadoopmaster:54310</value>
<description>test.</description>
</property>
Answer 0 (score: 0)
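In the example provided, the configuration properties on the client do not match those of the cluster. For fs.default.name, the client code sets hdfs://namenode.com:9000, while the cluster's core-site.xml has hdfs://Hadoopmaster:54310.
To connect to a remote Hadoop cluster from Java, I used the following code: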
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public static void main(String[] a) {
    // run the client code as the remote user "root"
    UserGroupInformation ugi
            = UserGroupInformation.createRemoteUser("root");
    try {
        ugi.doAs(new PrivilegedExceptionAction<Void>() {
            @Override
            public Void run() throws Exception {
                Configuration conf = new Configuration();
                // fs.default.name should match the corresponding value
                // in your core-site.xml in hadoop cluster
                conf.set("fs.default.name", "hdfs://hostname:9000");
                conf.set("hadoop.job.ugi", "root");
                // in case you are running mapreduce job , need to set
                // 'mapred.job.tracker' as you did
                conf.set("mapred.job.tracker", "hostname:port");
                // do your code here.
                return null;
            }
        });
    } catch (Exception e) {
        e.printStackTrace();
    }
}
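To check that the client values match the cluster, one option is to read them from a copy of the cluster's own core-site.xml or mapred-site.xml rather than typing them by hand. The following is only a minimal sketch using the JDK's XML parser; the class name SiteXmlReader and the method readProperties are hypothetical, and the inline XML stands in for a file copied from the cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop: reads name/value pairs from a *-site.xml.
public class SiteXmlReader {

    /** Returns every <property> name/value pair found in a Hadoop-style site XML stream. */
    public static Map<String, String> readProperties(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        NodeList props = doc.getElementsByTagName("property");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = text(p, "name");
            String value = text(p, "value");
            // skip malformed entries that lack a name or a value
            if (name != null && value != null) {
                result.put(name, value);
            }
        }
        return result;
    }

    /** Returns the trimmed text of the first child element with the given tag, or null. */
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: this string stands in for the cluster's real core-site.xml.
        String coreSite =
                "<configuration><property>"
                + "<name>fs.default.name</name>"
                + "<value>hdfs://Hadoopmaster:54310</value>"
                + "</property></configuration>";
        InputStream in = new ByteArrayInputStream(coreSite.getBytes(StandardCharsets.UTF_8));

        String clusterFs = readProperties(in).get("fs.default.name");
        String clientFs = "hdfs://namenode.com:9000"; // value used in the question's client code

        System.out.println("cluster fs.default.name = " + clusterFs);
        System.out.println("client matches cluster: " + clientFs.equals(clusterFs));
    }
}
```

With the values from the question this prints the cluster address hdfs://Hadoopmaster:54310 and reports that the client setting does not match. If the Hadoop libraries are on the classpath, Configuration.addResource can load a copied site file directly, which avoids parsing it yourself.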
If you need the default values, check:
http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/core-default.xml
and