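I wrote a Driver, Mapper, and Reducer to try out composite keys (a key built from more than one field of the input dataset).
The dataset looks like this:
Country,State,County,Population (millions)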
USA,CA,Alameda,12
USA,CA,Santa Clara,14
USA,AZ,Abajd,14
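I am trying to find the total population per Country + State, so the reducer should aggregate on the two fields Country and State and emit the total population for each pair.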
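As a plain-Java sketch of the intended aggregation (no Hadoop involved), the sample rows can be grouped on the first two fields and summed. The names `CompositeKeySketch` and `totalByCountryAndState` are hypothetical, and the key here is just the string `Country,State`, standing in for the `Country` writable used in the job.

```java
import java.util.Map;
import java.util.TreeMap;

public class CompositeKeySketch {

    // Sums the population column per "Country,State" pair.
    // Each row is expected as: Country,State,County,Population
    static Map<String, Integer> totalByCountryAndState(String[] rows) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String row : rows) {
            String[] fields = row.split(",");
            // Composite key: the first two fields joined together.
            String key = fields[0].trim() + "," + fields[1].trim();
            int population = Integer.parseInt(fields[3].trim());
            totals.merge(key, population, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] rows = {
            "USA,CA,Alameda,12",
            "USA,CA,Santa Clara,14",
            "USA,AZ,Abajd,14"
        };
        // prints {USA,AZ=14, USA,CA=26}
        System.out.println(totalByCountryAndState(rows));
    }
}
```

For the three sample rows, the expected totals are therefore 26 for USA,CA and 14 for USA,AZ.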
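When I iterate over the values at this step in the reducer code:
for(IntWritable i:values)
I get the compiler error "Can only iterate over an array or an instance of java.lang.Iterable".
So is it not possible to get an iterator over IntWritable? I was able to get the Iterator to work with the FloatWritable data type.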
Many thanks,
Nat
import java.io.DataInput;
import java.io.DataOutput;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import org.apache.commons.io.FileUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class CompositeKeyReducer extends Reducer<Country, IntWritable, Country, FloatWritable> {
    // public class CompositeKeyReducer extends Reducer<Country, IntWritable, Country, IntWritable> {

    public void reduce(Country key, Iterator<IntWritable> values, Context context) throws IOException, InterruptedException {
        int numberofelements = 0;
        int cnt = 0;
        while (values.hasNext()) {
            cnt = cnt + values.next().get();
        }
        // USA, Alameda = 10
        // USA, Santa Clara = 12
        // USA, Sacramento = 12
        float populationinmillions = 0;
        // The compiler error is reported on the for statement below.
        for(IntWritable i:values)
        {
            populationinmillions = populationinmillions + i.get();
            numberofelements = numberofelements + 1;
        }
        // context.write(key, new IntWritable(cnt));
        context.write(key, new FloatWritable(populationinmillions));
    }
}
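The error message is about the type of `values`, not about `IntWritable`. A for-each loop accepts only arrays and `java.lang.Iterable`, while the `reduce` method above declares `values` as a `java.util.Iterator`. In the `org.apache.hadoop.mapreduce.Reducer` API the method to override takes `Iterable<VALUEIN> values`, so a `reduce` declared with `Iterator` is only an overload and would not be called by the framework. The difference can be shown without Hadoop in a small sketch; the class and method names below are hypothetical, and `Integer` stands in for `IntWritable`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IterableVersusIterator {

    // Compiles: for-each works on any java.lang.Iterable.
    static int sumIterable(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // An Iterator has to be walked by hand. Writing
    // "for (int v : values)" here would be rejected by the compiler,
    // because Iterator is neither an array nor an Iterable.
    static int sumIterator(Iterator<Integer> values) {
        int total = 0;
        while (values.hasNext()) {
            total += values.next();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> populations = Arrays.asList(12, 14);
        System.out.println(sumIterable(populations)); // prints 26

        Iterator<Integer> it = populations.iterator();
        System.out.println(sumIterator(it)); // prints 26
        System.out.println(sumIterator(it)); // prints 0: the iterator is already exhausted
    }
}
```

Applied to the reducer, this suggests declaring the parameter as `Iterable<IntWritable> values`, marking the method with `@Override` so the compiler checks the signature, and keeping a single loop over the values. The last call in the sketch also shows why an `Iterator` cannot be consumed twice: once the `while` loop has run, a second pass sees no elements.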