MapReduce output file is empty

Posted: 2017-04-14 11:29:16

Tags: java mapreduce

I have a program that prints the average balance and counts the number of customers. Everything was working fine until I noticed that the part-r-00000 file is empty. This is strange, because I haven't changed anything in the Hadoop configuration. I'm posting the console output from cmd below.

17/04/14 14:21:31 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
17/04/14 14:21:31 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
17/04/14 14:21:31 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/04/14 14:21:31 INFO input.FileInputFormat: Total input paths to process : 1
17/04/14 14:21:31 INFO mapreduce.JobSubmitter: number of splits:1
17/04/14 14:21:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1656799721_0001
17/04/14 14:21:32 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/04/14 14:21:32 INFO mapreduce.Job: Running job: job_local1656799721_0001
17/04/14 14:21:32 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/04/14 14:21:32 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/14 14:21:32 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Waiting for map tasks
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Starting task: attempt_local1656799721_0001_m_000000_0
17/04/14 14:21:32 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/14 14:21:32 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/04/14 14:21:32 INFO mapred.Task:  Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@7c8cb1b6
17/04/14 14:21:32 INFO mapred.MapTask: Processing split: hdfs://localhost:19000/datagen/data/customer.tbl:0+2411114
17/04/14 14:21:32 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/14 14:21:32 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/14 14:21:32 INFO mapred.MapTask: soft limit at 83886080
17/04/14 14:21:32 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/14 14:21:32 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/14 14:21:32 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/14 14:21:32 INFO mapred.LocalJobRunner:
17/04/14 14:21:32 INFO mapred.MapTask: Starting flush of map output
17/04/14 14:21:32 INFO mapred.Task: Task:attempt_local1656799721_0001_m_000000_0 is done. And is in the process of committing
17/04/14 14:21:32 INFO mapred.LocalJobRunner: map
17/04/14 14:21:32 INFO mapred.Task: Task 'attempt_local1656799721_0001_m_000000_0' done.
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Finishing task: attempt_local1656799721_0001_m_000000_0
17/04/14 14:21:32 INFO mapred.LocalJobRunner: map task executor complete.
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Waiting for reduce tasks
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Starting task: attempt_local1656799721_0001_r_000000_0
17/04/14 14:21:32 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/14 14:21:32 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/04/14 14:21:32 INFO mapred.Task:  Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@25135c4c
17/04/14 14:21:32 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@2d7e552d
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
17/04/14 14:21:32 INFO reduce.EventFetcher: attempt_local1656799721_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
17/04/14 14:21:32 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1656799721_0001_m_000000_0 decomp: 2 len: 6 to MEMORY
17/04/14 14:21:32 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1656799721_0001_m_000000_0
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->2
17/04/14 14:21:32 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
17/04/14 14:21:32 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
17/04/14 14:21:32 INFO mapred.Merger: Merging 1 sorted segments
17/04/14 14:21:32 INFO mapred.Merger: Down to the last merge-pass, with 0 segments left of total size: 0 bytes
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: Merged 1 segments, 2 bytes to disk to satisfy reduce memory limit
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: Merging 1 files, 6 bytes from disk
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
17/04/14 14:21:32 INFO mapred.Merger: Merging 1 sorted segments
17/04/14 14:21:32 INFO mapred.Merger: Down to the last merge-pass, with 0 segments left of total size: 0 bytes
17/04/14 14:21:32 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/14 14:21:32 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
17/04/14 14:21:32 INFO mapred.Task: Task:attempt_local1656799721_0001_r_000000_0 is done. And is in the process of committing
17/04/14 14:21:32 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/14 14:21:32 INFO mapred.Task: Task attempt_local1656799721_0001_r_000000_0 is allowed to commit now
17/04/14 14:21:32 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1656799721_0001_r_000000_0' to hdfs://localhost:19000/out19/_temporary/0/task_local1656799721_0001_r_000000
17/04/14 14:21:32 INFO mapred.LocalJobRunner: reduce > reduce
17/04/14 14:21:32 INFO mapred.Task: Task 'attempt_local1656799721_0001_r_000000_0' done.
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Finishing task: attempt_local1656799721_0001_r_000000_0
17/04/14 14:21:32 INFO mapred.LocalJobRunner: reduce task executor complete.
17/04/14 14:21:33 INFO mapreduce.Job: Job job_local1656799721_0001 running in uber mode : false
17/04/14 14:21:33 INFO mapreduce.Job:  map 100% reduce 100%
17/04/14 14:21:33 INFO mapreduce.Job: Job job_local1656799721_0001 completed successfully
17/04/14 14:21:33 INFO mapreduce.Job: Counters: 35
        File System Counters
                FILE: Number of bytes read=17482
                FILE: Number of bytes written=591792
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=4822228
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=13
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=4
        Map-Reduce Framework
                Map input records=15000
                Map output records=0
                Map output bytes=0
                Map output materialized bytes=6
                Input split bytes=113
                Combine input records=0
                Combine output records=0
                Reduce input groups=0
                Reduce shuffle bytes=6
                Reduce input records=0
                Reduce output records=0
                Spilled Records=0
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=0
                Total committed heap usage (bytes)=546308096
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=2411114
        File Output Format Counters
                Bytes Written=0

Code:

public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, Text> {

    private Text segment = new Text();

    //private ThreeWritableValues cust = new ThreeWritableValues();

    private Text word = new Text();

    private float balance = 0;

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] line = value.toString().split("\\|");

        String cust_key = line[1];

        int nation = Integer.parseInt(line[3]);

        if ((balance > 8000) && (nation < 15) && (nation > 1)) {
            segment.set(line[6]);
            word.set(cust_key + "," + balance);
            context.write(segment, word);
        }
    }
}

public static class AvgReducer extends Reducer<Text, Text, Text, Text> {

    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        context.write(key, values.iterator().next());
    }
}

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(MapReduceTest.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(AvgReducer.class);
    job.setReducerClass(AvgReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}

If anyone knows what is going on, please help.

1 answer:

Answer 0 (score: 4):

The map phase is not producing any output:

Map output records=0
Map output bytes=0

In the TokenizerMapper class, balance is defined with a value of 0:

private float balance = 0;

and in the map method the value of balance is still 0, yet it is checked against > 8000:

if ((balance > 8000) && (nation < 15) && (nation > 1)) {
    segment.set(line[6]);
    word.set(cust_key + "," + balance);
    context.write(segment, word);
}

The if condition is never met, so there is no mapper output and therefore no reducer output.
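
A minimal sketch of a corrected map method, assuming the balance is meant to be read from the input record itself. The field index 5 used below is an assumption based on the pipe-delimited customer.tbl layout (TPC-H style, where the account balance follows the phone column) and may need to be adjusted to the actual file schema:

public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    String[] line = value.toString().split("\\|");

    String cust_key = line[1];
    int nation = Integer.parseInt(line[3]);

    // Read the balance from the current record instead of relying on the
    // instance field that is initialized to 0 and never updated.
    // Index 5 is an assumption; change it to match your file layout.
    float balance = Float.parseFloat(line[5]);

    if ((balance > 8000) && (nation < 15) && (nation > 1)) {
        segment.set(line[6]);
        word.set(cust_key + "," + balance);
        context.write(segment, word);
    }
}

Once balance is populated per record, the Map output records counter should be non-zero for every line that passes the filter, and the part-r-00000 file should no longer be empty.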