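I am using Hadoop with 0 reducers. The goal is to build an object incrementally in the map method and then, at some point, write it (serialized) to the output folder. As I said, the reduce phase does nothing here. How should I do this? Here is what I have:
In the configure method, I get the path to the file: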
@Override
public void configure(JobConf conf) {
    taskSideEffectFile = FileOutputFormat.getWorkOutputPath(conf) + "/temp";
}
In the map method I build my object, which I eventually want to serialize. For now, I try to write it on every call to map:
@Override
public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
    AddInstanceToClassifier(value.toString());
    try {
        // serialize classifier
        weka.core.SerializationHelper.write(taskSideEffectFile, nb);
    } catch (Exception ex) {
        System.err.println("Failed to serialize classifier: " + ex.getMessage());
        throw new IOException("taskSideEffectFile: " + ex.getMessage());
    }
}
This is the error I get:
12/05/09 22:47:00 INFO mapred.JobClient: map 0% reduce 0%
12/05/09 22:47:08 INFO mapred.JobClient: Task Id : attempt_201205091117_0015_m_000001_0, Status : FAILED
java.io.IOException: taskSideEffectFile: hdfs:/192.168.78.129:9000/user/hadoop-user/output/_temporary/_attempt_201205091117_0015_m_000001_0/temp (No such file or directory)
at naive.bayes.hadoop.MusicClassifierMapper.SaveClassifier(MusicClassifierMapper.java:168)
at naive.bayes.hadoop.MusicClassifierMapper.map(MusicClassifierMapper.java:121)
at naive.bayes.hadoop.MusicClassifierMapper.map(MusicClassifierMapper.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
Note: I am using Yahoo's hadoop-0.18.0 (I think this is the only way I can run the application from Eclipse).
Answer 0 (score: 1)
Hadoop should store your temporary files and then "promote" them to the output folder when the task succeeds.
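The "No such file or directory" message in the question looks like what the local java.io file classes report, which suggests (this is an inference from the stack trace, not something the question confirms) that the filename overload of SerializationHelper.write opens the path on the local disk and cannot resolve an hdfs: location. A minimal sketch of the alternative, serializing to an OutputStream instead of a filename, follows. StreamSerializer is a hypothetical helper, not part of Hadoop or Weka, and the Hadoop wiring appears only in comments as an untested assumption:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical helper for illustration; not part of Hadoop or Weka.
final class StreamSerializer {
    private StreamSerializer() {}

    // Serializes the model to any OutputStream, then closes the stream.
    // In a Hadoop 0.18-style mapper the stream would presumably come from the
    // job's FileSystem instead of java.io (assumed wiring, not tested here):
    //   Path file = new Path(FileOutputFormat.getWorkOutputPath(conf),
    //                        "classifier-" + conf.get("mapred.task.id"));
    //   OutputStream out = file.getFileSystem(conf).create(file);
    //   weka.core.SerializationHelper.write(out, nb);
    static void write(OutputStream out, Serializable model) throws IOException {
        ObjectOutputStream oos = new ObjectOutputStream(out);
        try {
            oos.writeObject(model);
        } finally {
            oos.close(); // flushes and closes the underlying stream as well
        }
    }

    // Reads back one object written by write(), then closes the stream.
    static Object read(InputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream ois = new ObjectInputStream(in);
        try {
            return ois.readObject();
        } finally {
            ois.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the classifier that the mapper builds incrementally.
        ArrayList<String> model = new ArrayList<String>();
        model.add("instance-1");
        model.add("instance-2");

        // An in-memory buffer stands in for the HDFS output stream.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(buffer, model);

        Object restored = read(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println(restored.equals(model)); // prints "true"
    }
}
```

In the mapper itself, it would probably make sense to keep the JobConf from configure and do the write once in close(), which the old mapred API calls at the end of the task, rather than on every map call. A file name that differs per task (here derived from mapred.task.id, assuming that property is set in this Hadoop version) should keep the outputs of several map tasks from colliding when they are promoted to the output folder.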