(Disclaimer: I'm new to Hadoop and Java.)
As input, I have a table with a simple key-value structure:
key1 value1
key2 value2
key3 value3
key2 value4
key1 value5
key1 value6
As output, I want to collect, for each key, all the values that belong to it, like this:
key1, value1 value5 value6
key2, value2 value4
key3, value3
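To make the intended grouping concrete, here is a minimal plain-Java sketch with no Hadoop involved. The class name `GroupByKeySketch` and the helper `group` are hypothetical, and the sketch assumes each input line is a tab-separated key and value, which is what the mapper below expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupByKeySketch {

    // Groups "key<TAB>value" lines by key, keeping keys in first-seen order.
    static Map<String, List<String>> group(List<String> lines) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split("\t", -1);
            if (fields.length < 2) {
                continue; // assumption: rows without a value column are skipped
            }
            grouped.computeIfAbsent(fields[0], k -> new ArrayList<>()).add(fields[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "key1\tvalue1", "key2\tvalue2", "key3\tvalue3",
                "key2\tvalue4", "key1\tvalue5", "key1\tvalue6");
        for (Map.Entry<String, List<String>> e : group(lines).entrySet()) {
            System.out.println(e.getKey() + ", " + String.join(" ", e.getValue()));
        }
    }
}
```

In a MapReduce job the shuffle phase does this grouping, so the reducer only has to concatenate the values it receives for each key.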
Here is my mapper:
public class WordMapper extends Mapper<Object, Text, Text, Text> {
    @Override
    public void map(Object key, Text value,
                    Context context) throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\t", -1);
        for (int i = 0; i < fields.length; ++i) {
            if ("".equals(fields[i])) fields[i] = null;
        }
        List<String> fields_list = Arrays.asList(fields);
        Text textKey = new Text(fields_list.get(0));
        Text textValue = new Text(fields_list.get(1));
        context.write(textKey, textValue);
    }
}
Here is the reducer:
public class SumReducer extends Reducer<Text, TextArrayWritable, Text, TextArrayWritable> {
    private TextArrayWritable valuesTotal = new TextArrayWritable();

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        ArrayList<Text> values_list = new ArrayList<Text>();
        for (Text value : values) {
            values_list.add(value);
        }
        Text[] values_arr = new Text[values_list.size()];
        values_arr = values_list.toArray(values_arr);
        valuesTotal.setFields(values_arr);
        context.write(key, valuesTotal);
    }
}
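For some reason, I can't get any output from my program. It simply terminates and leaves nothing in the output folder. What is my problem here?
(I'm using Hadoop 2.2.0 and Eclipse with the Hadoop plugin. The WordCount example runs without problems.)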
Answer 0 (score: 1)
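Problem solved. After I enabled logging, it became clear that my data contains rows with a missing value in column 4, so I added a null check, if (fields[4] != null), and it works. I also got rid of the array-to-list conversion and the custom TextArrayWritable class.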
Mapper:
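@Override
public void map(Object key, Text value,
                Context context) throws IOException, InterruptedException {
    // The -1 limit keeps trailing empty columns, so column positions stay stable.
    String[] fields = value.toString().split("\\t", -1);
    // Normalize empty columns to null so one check covers every missing value.
    for (int i = 0; i < fields.length; ++i) {
        if ("".equals(fields[i])) fields[i] = null;
    }
    // Emit only rows that actually have column 4; the length check also
    // guards against rows with fewer than five columns.
    if (fields.length > 4 && fields[4] != null) {
        System.out.println(fields[0]);
        System.out.println(fields[4]);
        context.write(new Text(fields[0]), new Text(fields[4]));
    }
}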
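The null check in the mapper above can be tried outside Hadoop. The following is only a sketch: the class `ColumnGuardSketch` and the helper `extract` are hypothetical names, and it additionally assumes that rows with too few columns or an empty key should be skipped, which the answer does not state.

```java
public class ColumnGuardSketch {

    // Returns {key, value} from columns 0 and valueIndex,
    // or null when the row should be skipped.
    static String[] extract(String line, int valueIndex) {
        // The -1 limit keeps trailing empty columns instead of dropping them.
        String[] fields = line.split("\t", -1);
        if (fields.length <= valueIndex) {
            return null; // assumption: rows that are too short are skipped
        }
        String key = fields[0];
        String value = fields[valueIndex];
        if (key.isEmpty() || value.isEmpty()) {
            return null; // missing key or missing value
        }
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        String[] rows = { "a\tb\tc\td\te", "a\tb\tc\td\t", "a\tb" };
        for (String row : rows) {
            String[] kv = extract(row, 4);
            System.out.println(kv == null ? "skipped" : kv[0] + " -> " + kv[1]);
        }
    }
}
```

Checking the array length before indexing avoids an ArrayIndexOutOfBoundsException on short rows, and skipping empty values avoids ever handing a null string to the Text constructor.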
Reducer:
public class SongsReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Join all values for this key into one comma-separated string.
        boolean first = true;
        StringBuilder songs = new StringBuilder();
        for (Text val : values) {
            if (!first) {
                songs.append(",");
            }
            first = false;
            songs.append(val.toString());
        }
        context.write(key, new Text(songs.toString()));
    }
}
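The first-flag loop in the reducer above can also be written with java.util.StringJoiner. This is a sketch assuming Java 8 or later; `JoinValuesSketch` and `joinValues` are hypothetical names, and plain strings stand in for Hadoop's Text values.

```java
import java.util.Arrays;
import java.util.StringJoiner;

public class JoinValuesSketch {

    // Joins the values with commas; StringJoiner inserts the separator
    // only between elements, so no "first" flag is needed.
    static String joinValues(Iterable<String> values) {
        StringJoiner joiner = new StringJoiner(",");
        for (String value : values) {
            joiner.add(value);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinValues(Arrays.asList("value1", "value5", "value6")));
    }
}
```

Inside the reducer the loop body would become joiner.add(val.toString()), so each value is copied into a String as it is read.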