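In the Hadoop word count example, the IntWritable is static, so it can be reused within the same JVM instead of creating a new one each time. My question is: why not make the Text static as well? I did that and it works fine, but I have never seen it done in any example. Am I missing something?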
private static Text word = new Text(); // "static" added here
private final static IntWritable intWritable = new IntWritable(1);
The original word count example:
// Mapper from the classic WordCount example (old org.apache.hadoop.mapred API).
public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    // Constant count of 1, shared by every call.
    private final static IntWritable one = new IntWritable(1);
    // Reused for each token; an instance field here, not static.
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }
}
Answer 0 (score: 0)
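The OutputCollector API collects the key/value pairs emitted by Mappers and Reducers. Whether a variable has to be global (static) for the program to work correctly depends on your own logic and on the kind of problem the application is trying to solve. A static field is shared by every instance of the class in the JVM, so making word static only makes a difference when more than one mapper instance or more than one thread touches it at the same time. In the WordCount case the program works correctly because the mapper object does not share its state across multiple threads.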
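To illustrate why reusing a single mutable object works here, below is a minimal sketch in plain Java with no Hadoop dependency. Every name in it (ReuseDemo, MutableWord, CopyingCollector, ReferenceCollector) is hypothetical and only stands in for the real classes. It assumes the collector copies the content of the pair at the moment collect() is called, which is my understanding of how the map output path behaves when it serializes the record; treat that as an assumption rather than a statement about Hadoop internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these names come from Hadoop.
class ReuseDemo {

    // Plays the role of Text: set() overwrites the content in place.
    static final class MutableWord {
        private String value = "";
        void set(String v) { value = v; }
        String get() { return value; }
    }

    // Copies the content when collect() is called
    // (assumed analogue of serializing the pair immediately).
    static final class CopyingCollector {
        final List<String> out = new ArrayList<>();
        void collect(MutableWord w) { out.add(w.get()); }
    }

    // Keeps only the reference, so every entry aliases the same object.
    static final class ReferenceCollector {
        final List<MutableWord> out = new ArrayList<>();
        void collect(MutableWord w) { out.add(w); }
    }

    // One shared holder for the whole JVM, like a static Text field.
    private static final MutableWord WORD = new MutableWord();

    // Reuse is safe: each token is copied out before the next set().
    static List<String> runCopying(String line) {
        CopyingCollector collector = new CopyingCollector();
        for (String token : line.split("\\s+")) {
            WORD.set(token);
            collector.collect(WORD);
        }
        return collector.out;
    }

    // Reuse is unsafe: all stored references see the last token.
    static List<String> runReference(String line) {
        ReferenceCollector collector = new ReferenceCollector();
        for (String token : line.split("\\s+")) {
            WORD.set(token);
            collector.collect(WORD);
        }
        List<String> seen = new ArrayList<>();
        for (MutableWord w : collector.out) {
            seen.add(w.get());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runCopying("a b c"));   // prints [a, b, c]
        System.out.println(runReference("a b c")); // prints [c, c, c]
    }
}
```

With a single thread, the set() followed by collect() always completes before the next token is written, so one shared object is enough whether it is an instance field or a static one. If several threads shared the same static holder, one thread could call set() between another thread's set() and collect(), which is the kind of shared state the answer above refers to.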