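I need to write a simple map-reduce program that, given as input a directed graph represented as a list of edges, produces the same graph in which every edge (x, y) with x > y is replaced by (y, x), and in which the output contains no duplicate edges.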
INPUT
1;3
2;1
0;1
3;1
2;0
1;1
2;1
OUTPUT
1;3
1;2
0;1
0;2
1;1
Here is the code:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ExamGraph {

    // mapper class
    public static class MyMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            value = new Text(value.toString());
            String[] campi = value.toString().split(";");
            if (Integer.getInteger(campi[0]) > Integer.getInteger(campi[1]))
                context.write(new Text(campi[1] + ";" + campi[0]), NullWritable.get());
            else
                context.write(new Text(campi[0] + ";" + campi[1]), NullWritable.get());
        }
    }

    // reducer class
    public static class MyReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
        @Override
        protected void reduce(Text key, Iterable<NullWritable> values, Context context)
                throws IOException, InterruptedException {
            context.write(key, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        // create new job
        Job job = Job.getInstance(new Configuration());
        // job is based on jar containing this class
        job.setJarByClass(ExamGraph.class);
        // for logging purposes
        job.setJobName("ExamGraph");
        // set input path in HDFS
        FileInputFormat.addInputPath(job, new Path(args[0]));
        // set output path in HDFS (destination must not exist)
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // set mapper and reducer classes
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        // An InputFormat for plain text files.
        // Files are broken into lines. Either linefeed or carriage-return are used
        // to signal end of line. Keys are the position in the file, and values
        // are the line of text.
        job.setInputFormatClass(TextInputFormat.class);
        // set type of output keys and values for both mappers and reducers
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        // start job
        job.waitForCompletion(true);
    }
}
When I run the jar file with:
hadoop jar path/jar JOBNAME /inputlocation /outputlocation
I get this error:
18/05/22 02:13:11 INFO mapreduce.Job: Task Id : attempt_1526979627085_0001_m_000000_1, Status : FAILED
Error: java.lang.NullPointerException
at ExamGraph$MyMapper.map(ExamGraph.java:38)
at ExamGraph$MyMapper.map(ExamGraph.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
But I cannot find the error in the code.
Answer 0 (score: 0)
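Found the problem: in the mapper I had confused the method Integer.getInteger() with Integer.parseInt(). Integer.getInteger(String) does not parse its argument; it looks up the system property with that name and returns null when no such property is set, so the comparison in map() throws a NullPointerException when those null values are unboxed.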
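To make the difference concrete, here is a minimal standalone sketch in plain Java (no Hadoop needed). The class name EdgeNormalizer and the helper normalize() are hypothetical names introduced only for this illustration; they are not part of the original job. It assumes every input line has the form "x;y" with integer endpoints, and it uses an in-memory set as a stand-in for the deduplication that the reducer performs by key.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class EdgeNormalizer {

    // Hypothetical helper: returns the edge with the smaller endpoint first.
    // Uses Integer.parseInt, which actually parses the text of each field.
    static String normalize(String line) {
        String[] fields = line.trim().split(";");
        int x = Integer.parseInt(fields[0]);
        int y = Integer.parseInt(fields[1]);
        return x > y ? y + ";" + x : x + ";" + y;
    }

    public static void main(String[] args) {
        // Integer.getInteger("2") looks up a system property named "2".
        // Unless such a property has been set, the result is null.
        System.out.println(Integer.getInteger("2"));
        // Integer.parseInt("2") parses the string itself.
        System.out.println(Integer.parseInt("2"));

        // Sample input from the question; the set stands in for the
        // grouping by key that removes duplicates in the real job.
        String[] input = {"1;3", "2;1", "0;1", "3;1", "2;0", "1;1", "2;1"};
        Set<String> edges = new LinkedHashSet<>();
        for (String line : input) {
            edges.add(normalize(line));
        }
        System.out.println(edges);
    }
}
```

In the mapper itself, the corresponding change would be to replace both Integer.getInteger(...) calls in the if condition with Integer.parseInt(...); the reducer and the driver can stay as they are.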