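I have written my first MapReduce program. When I run it in Eclipse, it writes the output file and works as expected. However, when I run it from the command line with hadoop jar myjar.jar, the results are not written to the output file. The output files (_SUCCESS and part-r-00000) are created, but they are empty. Is there a persistence issue? The counters show Reduce input records = 12 but Reduce output records = 0. When I run the same job in Eclipse, Reduce output records is not 0. Any help is appreciated. Thanks.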
[cloudera@quickstart Desktop]$ sudo hadoop jar checkjar.jar hdfs://quickstart.cloudera:8020/user/cloudera/input.csv hdfs://quickstart.cloudera:8020/user/cloudera/output9
15/04/28 22:09:06 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/04/28 22:09:07 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/04/28 22:09:08 INFO input.FileInputFormat: Total input paths to process : 1
15/04/28 22:09:09 INFO mapreduce.JobSubmitter: number of splits:1
15/04/28 22:09:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1430279123629_0011
15/04/28 22:09:10 INFO impl.YarnClientImpl: Submitted application application_1430279123629_0011
15/04/28 22:09:10 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1430279123629_0011/
15/04/28 22:09:10 INFO mapreduce.Job: Running job: job_1430279123629_0011
15/04/28 22:09:22 INFO mapreduce.Job: Job job_1430279123629_0011 running in uber mode : false
15/04/28 22:09:22 INFO mapreduce.Job: map 0% reduce 0%
15/04/28 22:09:32 INFO mapreduce.Job: map 100% reduce 0%
15/04/28 22:09:46 INFO mapreduce.Job: map 100% reduce 100%
15/04/28 22:09:46 INFO mapreduce.Job: Job job_1430279123629_0011 completed successfully
15/04/28 22:09:46 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=265
FILE: Number of bytes written=211403
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=365
HDFS: Number of bytes written=0
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=8175
Total time spent by all reduces in occupied slots (ms)=10124
Total time spent by all map tasks (ms)=8175
Total time spent by all reduce tasks (ms)=10124
Total vcore-seconds taken by all map tasks=8175
Total vcore-seconds taken by all reduce tasks=10124
Total megabyte-seconds taken by all map tasks=8371200
Total megabyte-seconds taken by all reduce tasks=10366976
Map-Reduce Framework
Map input records=12
Map output records=12
Map output bytes=235
Map output materialized bytes=265
Input split bytes=120
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=265
Reduce input records=12
Reduce output records=0
Spilled Records=24
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=172
CPU time spent (ms)=1150
Physical memory (bytes) snapshot=346574848
Virtual memory (bytes) snapshot=1705988096
Total committed heap usage (bytes)=196481024
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=245
File Output Format Counters
Bytes Written=0
Reducer.java
package com.mapreduce.assgn4;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class JoinReducer
        extends Reducer<Text, Text, Text, Text> {

    @Override
    public void reduce(Text key, Iterable<Text> values,
                       Context context)
            throws IOException, InterruptedException {
        List<String> tableoneTuples = new ArrayList<String>();
        List<String> tabletwoTuples = new ArrayList<String>();

        // Each value has the form "<tableName>#<tuple>"; sort tuples by source table.
        for (Text value : values) {
            String[] splitValues = value.toString().split("#");
            String tableName = splitValues[0];
            if (tableName.equals(JoinMapper.tableone)) {
                tableoneTuples.add(splitValues[1]);
            } else {
                tabletwoTuples.add(splitValues[1]);
            }
        }

        // Debug output: number of tuples collected from each table for this key.
        System.out.println(tableoneTuples.size());
        System.out.println(tabletwoTuples.size());

        // Emit one joined line per (table one, table two) pair for this key.
        String FinaljoinString = null;
        for (String tableoneValue : tableoneTuples) {
            for (String tabletwoValue : tabletwoTuples) {
                FinaljoinString = tableoneValue + "," + tabletwoValue;
                FinaljoinString = key.toString() + "," + FinaljoinString;
                context.write(null, new Text(FinaljoinString));
            }
        }
    }
}
Answer 0 (score: 2)
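The context.write call in your reducer has a bug. To leave the key out of the output, pass a NullWritable instead of null. NullWritable is a singleton, so you obtain the instance with NullWritable.get():
context.write(NullWritable.get(), new Text(FinaljoinString));
For this to compile, the reducer's output key type also has to be NullWritable, that is Reducer<Text, Text, NullWritable, Text>, with an import of org.apache.hadoop.io.NullWritable and job.setOutputKeyClass(NullWritable.class) in the driver.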
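If the output is still empty after that change, it may be worth checking the join logic itself outside Hadoop. The following is a minimal plain-Java sketch of the reducer body, not the asker's actual job: the class name JoinSketch, the tags "T1" and "T2", and the sample values are all hypothetical, since the mapper and driver are not shown. It illustrates that the nested loop emits nothing whenever one of the two lists is empty, which would give Reduce output records = 0 even with 12 input records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JoinSketch {
    // Hypothetical stand-in for JoinMapper.tableone; the real value is not shown in the question.
    static final String TABLE_ONE = "T1";

    // Mirrors the reducer body: split each "tag#tuple" value into two lists,
    // then build one joined line per (table one, table two) pair.
    static List<String> join(String key, List<String> values, String tableOneTag) {
        List<String> tableOneTuples = new ArrayList<>();
        List<String> tableTwoTuples = new ArrayList<>();
        for (String value : values) {
            String[] parts = value.split("#");
            // equals(null) is always false, so a null tag sends every tuple to table two.
            if (parts[0].equals(tableOneTag)) {
                tableOneTuples.add(parts[1]);
            } else {
                tableTwoTuples.add(parts[1]);
            }
        }
        List<String> out = new ArrayList<>();
        for (String one : tableOneTuples) {
            for (String two : tableTwoTuples) {
                out.add(key + "," + one + "," + two);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("T1#a", "T2#x", "T2#y");
        System.out.println(join("k1", values, TABLE_ONE)); // prints [k1,a,x, k1,a,y]
        System.out.println(join("k1", values, null));      // prints []
    }
}
```

The second call models one possible situation, stated here as an assumption: if JoinMapper.tableone were a static field assigned only in the driver, it would be set when everything runs in a single JVM (as in Eclipse local mode) but would be unset in the reducer's JVM on the cluster. The two System.out.println sizes in the reducer's task logs would show whether one list is always empty.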