I am using the HDInsight .NET Hadoop API to submit a MapReduce job from an ASP.NET application.
using Microsoft.Hadoop.MapReduce;
var hadoop = Hadoop.Connect();
var result = hadoop.MapReduceJob.ExecuteJob();
// Also tried this, but got the same exception
// var result = hadoop.MapReduceJob.ExecuteJob(config);
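The ExecuteJob() call fails and throws an exception at run time. Has anyone managed to run this call successfully? Is it possible to customize the Map() function by adding more input parameters or objects (beyond those Microsoft provides in the MapperBase class)? Can the logic in the Mapper and Reducer methods access a cache or a database?
Answer 0 (score: 1)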
Here is a sample that submits a MapReduce job using the HDInsight .NET SDK:
// Namespaces used by this snippet
using System;
using System.Linq;
using System.Security.Cryptography.X509Certificates;
using Microsoft.Hadoop.Client;
using Microsoft.WindowsAzure.Management.HDInsight;

// subscriptionID, clusterName and certFriendlyName are strings you supply elsewhere

// Define the MapReduce job
MapReduceJobCreateParameters mrJobDefinition = new MapReduceJobCreateParameters()
{
    JarFile = "wasb:///example/jars/hadoop-examples.jar",
    ClassName = "wordcount"
};
mrJobDefinition.Arguments.Add("wasb:///example/data/gutenberg/davinci.txt");
mrJobDefinition.Arguments.Add("wasb:///example/data/WordCountOutput");

// Get the certificate object from the certificate store, using the friendly name to identify it
X509Store store = new X509Store();
store.Open(OpenFlags.ReadOnly);
X509Certificate2 cert = store.Certificates.Cast<X509Certificate2>().First(item => item.FriendlyName == certFriendlyName);
JobSubmissionCertificateCredential creds = new JobSubmissionCertificateCredential(new Guid(subscriptionID), cert, clusterName);

// Create a Hadoop client to connect to HDInsight
var jobClient = JobSubmissionClientFactory.Connect(creds);

// Run the MapReduce job
JobCreationResults mrJobResults = jobClient.CreateMapReduceJob(mrJobDefinition);

// Wait for the job to complete (WaitForJobCompletion is a helper you define yourself)
WaitForJobCompletion(mrJobResults, jobClient);
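The WaitForJobCompletion helper called on the last line is not defined in the snippet above. A minimal sketch of such a polling helper might look like the following. It assumes the IJobSubmissionClient, JobDetails and JobStatusCode types from the Microsoft.Hadoop.Client namespace of the same SDK; the wrapper class name JobHelpers and the 10-second polling interval are arbitrary choices for illustration, and the code needs the SDK package and a live cluster to run.

```csharp
using System;
using System.Threading;
using Microsoft.Hadoop.Client;

public static class JobHelpers // hypothetical wrapper class, name chosen for illustration
{
    // Polls the cluster until the job reports Completed or Failed.
    public static void WaitForJobCompletion(JobCreationResults jobResults, IJobSubmissionClient client)
    {
        JobDetails jobInProgress = client.GetJob(jobResults.JobId);
        while (jobInProgress.StatusCode != JobStatusCode.Completed &&
               jobInProgress.StatusCode != JobStatusCode.Failed)
        {
            // Polling interval is an assumption; tune it to your job length
            Thread.Sleep(TimeSpan.FromSeconds(10));
            jobInProgress = client.GetJob(jobInProgress.JobId);
        }
    }
}
```

With this in place, the call would be JobHelpers.WaitForJobCompletion(mrJobResults, jobClient), or the method can be moved into the same class as the submitting code so the unqualified call compiles. The loop does not time out, so a production version would probably add an upper bound on the wait.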