I need a MapReduce program in Java that produces a double word count (a count of each pair of adjacent words) in Hadoop.
Input:
"What is your name? What do you want from me ? You know the best way to earn money is Hardwork. What is your aim?"
Double W.C. output: what is 2, is your 2, your name 1, what do 1
A quick response would be much appreciated.
Thanks in advance.
Answer 0 (score: 2)
The following code works for me.
package hadoop;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class doubleWc {

    // Mapper: emits (word pair, 1) for every two adjacent words on a line.
    public static class doubMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        Text outkey = new Text();
        IntWritable outvalue = new IntWritable();

        @Override
        public void map(LongWritable key, Text values, Context context)
                throws IOException, InterruptedException {
            // Split the line into words on runs of whitespace, matching the
            // space-separated sample input in the question.
            String[] cols = values.toString().trim().split("\\s+");
            // Stop one short of the end so cols[i + 1] is always valid.
            for (int i = 0; i < cols.length - 1; i++) {
                outkey.set(cols[i] + " " + cols[i + 1]);
                outvalue.set(1);
                context.write(outkey, outvalue);
            }
        }
    }

    // Reducer: sums the 1s emitted for each word pair.
    public static class douReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        IntWritable outvalue = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable t : values) {
                sum = sum + t.get();
            }
            outvalue.set(sum);
            context.write(key, outvalue);
        }
    }

    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        // Job.getInstance replaces the deprecated Job constructor.
        Job job = Job.getInstance(conf, "double program");
        job.setJarByClass(doubleWc.class);
        job.setMapperClass(doubMapper.class);
        job.setReducerClass(douReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // args[0] is the input path, args[1] the output path (must not exist yet).
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Exit with 0 on success and 1 on failure.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
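To check the pair-counting logic without a cluster, here is a minimal plain-Java sketch of the same idea. The class BigramCounter and the method countBigrams are hypothetical names, not part of the job above. The sketch also assumes that words should be lowercased and punctuation dropped, which is what the expected output in the question ("what is 2") seems to need.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for local experiments; it does not use Hadoop.
final class BigramCounter {

    // Counts each pair of adjacent words in a single line of text.
    static Map<String, Integer> countBigrams(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Assumption: lowercase the text and treat any run of characters
        // that are not letters or digits as a word separator.
        String[] words = line.toLowerCase(Locale.ROOT).trim().split("[^\\p{L}\\p{N}]+");
        for (int i = 0; i < words.length - 1; i++) {
            // A line that starts with a separator yields an empty first token.
            if (words[i].isEmpty()) {
                continue;
            }
            counts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String input = "What is your name? What do you want from me ? "
                + "You know the best way to earn money is Hardwork. What is your aim?";
        // Prints one "word1 word2 count" line per distinct pair.
        countBigrams(input).forEach((pair, n) -> System.out.println(pair + " " + n));
    }
}
```

If this normalization is what you want, the same lowercasing and splitting could be applied inside the mapper's map method. Note that with the default text input each map call sees one line, so pairs that span a line break are not counted.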
Please let me know if it helps!