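I am new to Giraph and Hadoop YARN. Following the Giraph quick start, I was able to run the example job from the command line using the jar built from source.
I would like to run that job from a simple Java program instead. This question is inspired by an earlier, similar question about MapReduce jobs; I am looking for the equivalent answer for Giraph in Java.
I have YARN set up locally and need a way to submit the job from a Java program.
Judging by https://giraph.apache.org/apidocs/org/apache/giraph/job/GiraphJob.html there must be a way to do this, but I am finding it hard to locate an example that uses it with YARN.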
Answer 0 (score: 0)
I found a way by reading the source code of GiraphRunner:
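import java.io.IOException;
import org.apache.giraph.conf.GiraphConfiguration;
import org.apache.giraph.conf.GiraphConstants;
import org.apache.giraph.io.formats.GiraphFileInputFormat;
import org.apache.giraph.io.formats.IdWithValueTextOutputFormat;
import org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat;
import org.apache.giraph.job.GiraphJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.junit.Test;
// PageRankComputation must also be imported; its package depends on which Giraph module supplies it.

public class GiraphPageRankTest {
    @Test
    public void testPageRank() throws IOException, ClassNotFoundException, InterruptedException {
        GiraphConfiguration giraphConf = new GiraphConfiguration(getConf());
        // Arguments: min workers, max workers, percentage of workers that must respond.
        giraphConf.setWorkerConfiguration(1, 1, 100);
        // Do not split master and worker into separate tasks (single-node setup).
        GiraphConstants.SPLIT_MASTER_WORKER.set(giraphConf, false);
        giraphConf.setVertexInputFormatClass(JsonLongDoubleFloatDoubleVertexInputFormat.class);
        GiraphFileInputFormat.setVertexInputPath(giraphConf,
                new Path("/input/tiny-graph.txt"));
        giraphConf.setVertexOutputFormatClass(IdWithValueTextOutputFormat.class);
        giraphConf.setComputationClass(PageRankComputation.class);
        GiraphJob giraphJob = new GiraphJob(giraphConf, "page-rank");
        FileOutputFormat.setOutputPath(giraphJob.getInternalJob(),
                new Path("/output/page-rank2"));
        // Blocks until the job finishes; the boolean return value reports success.
        giraphJob.run(true);
    }

    private Configuration getConf() {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        conf.set("yarn.resourcemanager.address", "localhost:8032");
        conf.set("yarn.resourcemanager.hostname", "localhost");
        // The framework is now "yarn"; it should be defined like this in mapred-site.xml.
        conf.set("mapreduce.framework.name", "yarn");
        return conf;
    }
}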
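The job above reads /input/tiny-graph.txt through JsonLongDoubleFloatDoubleVertexInputFormat, which expects one JSON array per line in the form [vertexId, vertexValue, [[targetId, edgeValue], ...]]. As a minimal sketch for producing such a file locally, the plain-Java program below writes a five-vertex graph. The class name TinyGraphWriter, its helper methods, and the default file name are hypothetical and not part of Giraph; the vertex and edge values are assumed to mirror the tiny graph from the Giraph quick start.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Giraph: writes a small graph in the
// line-per-vertex JSON layout [id,value,[[target,edgeValue],...]].
public class TinyGraphWriter {

    /** Formats one vertex line; targets[i] is paired with weights[i]. */
    public static String formatVertex(long id, double value, long[] targets, float[] weights) {
        if (targets.length != weights.length) {
            throw new IllegalArgumentException("targets and weights must have equal length");
        }
        StringBuilder sb = new StringBuilder();
        sb.append('[').append(id).append(',').append(value).append(",[");
        for (int i = 0; i < targets.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('[').append(targets[i]).append(',').append(weights[i]).append(']');
        }
        return sb.append("]]").toString();
    }

    /** Builds the five vertex lines (values assumed from the quick start graph). */
    public static List<String> tinyGraph() {
        List<String> lines = new ArrayList<>();
        lines.add(formatVertex(0, 0.0, new long[] {1, 3}, new float[] {1f, 3f}));
        lines.add(formatVertex(1, 0.0, new long[] {0, 2, 3}, new float[] {1f, 2f, 1f}));
        lines.add(formatVertex(2, 0.0, new long[] {1, 4}, new float[] {2f, 4f}));
        lines.add(formatVertex(3, 0.0, new long[] {0, 1, 4}, new float[] {3f, 1f, 4f}));
        lines.add(formatVertex(4, 0.0, new long[] {3, 2}, new float[] {4f, 4f}));
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Output file defaults to tiny-graph.txt in the working directory.
        String target = args.length > 0 ? args[0] : "tiny-graph.txt";
        Files.write(Paths.get(target), tinyGraph(), StandardCharsets.UTF_8);
    }
}
```

The resulting file still has to be copied to HDFS before the job runs, for example with hdfs dfs -put tiny-graph.txt /input/tiny-graph.txt, so that it matches the input path used in the test.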