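I have a program that prints the average balance and counts the number of customers. Everything worked fine until I noticed that the part-r-0000 file is empty. This is strange, because I did not change anything in the Hadoop configuration. The console output from cmd is posted below.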
17/04/14 14:21:31 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
17/04/14 14:21:31 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
17/04/14 14:21:31 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/04/14 14:21:31 INFO input.FileInputFormat: Total input paths to process : 1
17/04/14 14:21:31 INFO mapreduce.JobSubmitter: number of splits:1
17/04/14 14:21:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1656799721_0001
17/04/14 14:21:32 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/04/14 14:21:32 INFO mapreduce.Job: Running job: job_local1656799721_0001
17/04/14 14:21:32 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/04/14 14:21:32 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/14 14:21:32 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Waiting for map tasks
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Starting task: attempt_local1656799721_0001_m_000000_0
17/04/14 14:21:32 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/14 14:21:32 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/04/14 14:21:32 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@7c8cb1b6
17/04/14 14:21:32 INFO mapred.MapTask: Processing split: hdfs://localhost:19000/datagen/data/customer.tbl:0+2411114
17/04/14 14:21:32 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/14 14:21:32 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/14 14:21:32 INFO mapred.MapTask: soft limit at 83886080
17/04/14 14:21:32 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/14 14:21:32 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/14 14:21:32 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/14 14:21:32 INFO mapred.LocalJobRunner:
17/04/14 14:21:32 INFO mapred.MapTask: Starting flush of map output
17/04/14 14:21:32 INFO mapred.Task: Task:attempt_local1656799721_0001_m_000000_0 is done. And is in the process of committing
17/04/14 14:21:32 INFO mapred.LocalJobRunner: map
17/04/14 14:21:32 INFO mapred.Task: Task 'attempt_local1656799721_0001_m_000000_0' done.
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Finishing task: attempt_local1656799721_0001_m_000000_0
17/04/14 14:21:32 INFO mapred.LocalJobRunner: map task executor complete.
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Waiting for reduce tasks
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Starting task: attempt_local1656799721_0001_r_000000_0
17/04/14 14:21:32 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/14 14:21:32 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/04/14 14:21:32 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@25135c4c
17/04/14 14:21:32 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@2d7e552d
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
17/04/14 14:21:32 INFO reduce.EventFetcher: attempt_local1656799721_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
17/04/14 14:21:32 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1656799721_0001_m_000000_0 decomp: 2 len: 6 to MEMORY
17/04/14 14:21:32 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1656799721_0001_m_000000_0
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->2
17/04/14 14:21:32 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
17/04/14 14:21:32 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
17/04/14 14:21:32 INFO mapred.Merger: Merging 1 sorted segments
17/04/14 14:21:32 INFO mapred.Merger: Down to the last merge-pass, with 0 segments left of total size: 0 bytes
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: Merged 1 segments, 2 bytes to disk to satisfy reduce memory limit
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: Merging 1 files, 6 bytes from disk
17/04/14 14:21:32 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
17/04/14 14:21:32 INFO mapred.Merger: Merging 1 sorted segments
17/04/14 14:21:32 INFO mapred.Merger: Down to the last merge-pass, with 0 segments left of total size: 0 bytes
17/04/14 14:21:32 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/14 14:21:32 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
17/04/14 14:21:32 INFO mapred.Task: Task:attempt_local1656799721_0001_r_000000_0 is done. And is in the process of committing
17/04/14 14:21:32 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/14 14:21:32 INFO mapred.Task: Task attempt_local1656799721_0001_r_000000_0 is allowed to commit now
17/04/14 14:21:32 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1656799721_0001_r_000000_0' to hdfs://localhost:19000/out19/_temporary/0/task_local1656799721_0001_r_000000
17/04/14 14:21:32 INFO mapred.LocalJobRunner: reduce > reduce
17/04/14 14:21:32 INFO mapred.Task: Task 'attempt_local1656799721_0001_r_000000_0' done.
17/04/14 14:21:32 INFO mapred.LocalJobRunner: Finishing task: attempt_local1656799721_0001_r_000000_0
17/04/14 14:21:32 INFO mapred.LocalJobRunner: reduce task executor complete.
17/04/14 14:21:33 INFO mapreduce.Job: Job job_local1656799721_0001 running in uber mode : false
17/04/14 14:21:33 INFO mapreduce.Job: map 100% reduce 100%
17/04/14 14:21:33 INFO mapreduce.Job: Job job_local1656799721_0001 completed successfully
17/04/14 14:21:33 INFO mapreduce.Job: Counters: 35
File System Counters
FILE: Number of bytes read=17482
FILE: Number of bytes written=591792
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=4822228
HDFS: Number of bytes written=0
HDFS: Number of read operations=13
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Map-Reduce Framework
Map input records=15000
Map output records=0
Map output bytes=0
Map output materialized bytes=6
Input split bytes=113
Combine input records=0
Combine output records=0
Reduce input groups=0
Reduce shuffle bytes=6
Reduce input records=0
Reduce output records=0
Spilled Records=0
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=0
Total committed heap usage (bytes)=546308096
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=2411114
File Output Format Counters
Bytes Written=0
Code
public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, Text> {
    private Text segment = new Text();
    //private ThreeWritableValues cust = new ThreeWritableValues();
    private Text word = new Text();
    private float balance = 0;

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] line = value.toString().split("\\|");
        String cust_key = line[1];
        int nation = Integer.parseInt(line[3]);
        if ((balance > 8000) && (nation < 15) && (nation > 1)) {
            segment.set(line[6]);
            word.set(cust_key + "," + balance);
            context.write(segment, word);
        }
    }
}

public static class AvgReducer extends Reducer<Text, Text, Text, Text> {
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        context.write(key, values.iterator().next());
    }
}

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(MapReduceTest.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(AvgReducer.class);
    job.setReducerClass(AvgReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
If anyone knows what is going wrong, please help.
Answer 0 (score: 4)
The map phase produced no output, as the job counters show:
Map output records=0
Map output bytes=0
In the TokenizerMapper class, the value of balance is initialized to 0:
private float balance = 0;
In the map method, balance is never assigned, so it is still 0 when it is compared against 8000:
if ((balance > 8000) && (nation < 15) && (nation > 1)) {
    segment.set(line[6]);
    word.set(cust_key + "," + balance);
    context.write(segment, word);
}
The if condition is never satisfied, so the mapper writes no output and the reducer therefore has nothing to write either.
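
One possible fix is to parse the balance from each input row before testing it. The sketch below shows only the filtering logic as plain Java, without the Hadoop classes, so that it can be run on its own. The class name BalanceFilterSketch, the method toRecord, and the column index 5 are assumptions: the index guesses that customer.tbl follows the TPC-H layout (custkey|name|address|nationkey|phone|acctbal|mktsegment|comment), which the question does not confirm, so adjust it to the real file.

```java
import java.util.Optional;

public class BalanceFilterSketch {
    // Assumption: the account balance is the sixth field (index 5),
    // as in the TPC-H customer.tbl layout. Not confirmed by the question.
    static final int BALANCE_COLUMN = 5;

    // Returns "segment<TAB>cust_key,balance" for rows that pass the filter,
    // or an empty Optional for rows that the mapper should skip.
    static Optional<String> toRecord(String row) {
        String[] line = row.split("\\|");
        String custKey = line[1];
        int nation = Integer.parseInt(line[3]);
        // The missing step: read the balance from the row instead of
        // leaving the field at its initial value of 0.
        float balance = Float.parseFloat(line[BALANCE_COLUMN]);
        if (balance > 8000 && nation < 15 && nation > 1) {
            return Optional.of(line[6] + "\t" + custKey + "," + balance);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        String high = "1|Customer#000000001|addr|7|17-000-000-0000|9561.95|BUILDING|comment|";
        String low = "2|Customer#000000002|addr|7|17-000-000-0000|121.65|AUTOMOBILE|comment|";
        System.out.println(toRecord(high).isPresent()); // row passes the filter
        System.out.println(toRecord(low).isPresent());  // row is filtered out
    }
}
```

In the real mapper, the same Float.parseFloat line would go inside map, just before the if, and balance could then be a local variable rather than a field. After that change, Map output records in the counters should be greater than 0 whenever the input contains matching rows.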