I have an input file in which the first column is the patent and the second column is the sub-patent, as shown below:
1 1.232
1 1.45
1 1.153
1 1.100
2 2.179
2 2.206
2 2.59
3 3.233
3 3.171
3 3.197
3 3.40
3 3.201
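I have to count the number of sub-patents associated with each patent. I tried taking the mapper input as a string, but it throws an error. My code is below: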
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    //private IntWritable word = new IntWritable();

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            value.set(tokenizer.nextToken());
            context.write(value, one);
        }
    }
}

public static class Reduce extends Reducer<Text, IntWritable, IntWritable, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        // TODO Auto-generated method stub
        for (IntWritable x : values) {
            sum += x.get(); // sum = sum + x.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
Please help. My output should look like this:
Patent Number of Associated Sub-patents
1 4
2 3
3 5
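To make the expected result concrete, here is a minimal plain-Java sketch (no Hadoop involved) of the counting I am after. It assumes the two columns are separated by whitespace and that only the first column should be used as the grouping key. The class name PatentCount and the method countSubPatents are made up for illustration and are not part of my job.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PatentCount {

    // Hypothetical helper: counts how many rows share the same first-column value.
    // Assumption: columns are whitespace-separated, first column is the patent.
    public static Map<String, Integer> countSubPatents(List<String> lines) {
        // TreeMap keeps keys in sorted (lexicographic) order for printing.
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] cols = trimmed.split("\\s+");
            if (cols.length < 2) {
                continue; // skip rows without a sub-patent column
            }
            // Only the patent (first column) is counted, once per row.
            counts.merge(cols[0], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "1 1.232", "1 1.45", "1 1.153", "1 1.100",
            "2 2.179", "2 2.206", "2 2.59",
            "3 3.233", "3 3.171", "3 3.197", "3 3.40", "3 3.201");
        // Prints one "patent count" pair per line.
        countSubPatents(lines).forEach((patent, n) -> System.out.println(patent + " " + n));
    }
}
```

In MapReduce terms this would presumably correspond to the mapper emitting only the first token of each line as the key with a value of 1, and the reducer summing those values per key.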
Thanks, Sahitya