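I am looking for a MapReduce program that reads from a Hive table and writes the first column value of each record to an HDFS location. It should have only a map phase, with no reduce phase.
Here is the mapper: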
public class Map extends Mapper<WritableComparable, HCatRecord, NullWritable, IntWritable> {
    protected void map(WritableComparable key,
                       HCatRecord value,
                       org.apache.hadoop.mapreduce.Mapper<WritableComparable, HCatRecord,
                               NullWritable, IntWritable>.Context context)
            throws IOException, InterruptedException {
        // The group table from /etc/group has name, 'x', id
        // groupname = (String) value.get(0);
        int id = (Integer) value.get(1);
        // Just select and emit the name and ID
        context.write(null, new IntWritable(id));
    }
}
Main class:
public class mapper1 {
    public static void main(String[] args) throws Exception {
        mapper1 m = new mapper1();
        m.run(args);
    }

    public void run(String[] args) throws IOException, Exception, InterruptedException {
        Configuration conf = new Configuration();
        // Get the input and output table names as arguments
        String inputTableName = args[0];
        // Assume the default database
        String dbName = "xademo";
        Job job = new Job(conf, "UseHCat");
        job.setJarByClass(mapper1.class);
        HCatInputFormat.setInput(job, dbName, inputTableName);
        job.setMapperClass(Map.class);
        // An HCatalog record as input
        job.setInputFormatClass(HCatInputFormat.class);
        // Mapper emits a string as key and an integer as value
        job.setMapOutputKeyClass(NullWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        FileOutputFormat.setOutputPath((JobConf) conf, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
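Is there anything wrong with this code?
It fails with a NumberFormatException for the input string "5s". I am not sure where that value comes from. The error is reported on the following line: HCatInputFormat.setInput()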
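To narrow down where "5s" might come from, here is a small plain-Java sketch that reproduces the same kind of exception outside Hadoop. It rests on an assumption the post does not confirm: that "5s" is a duration written with a unit suffix (for example a setting read from a configuration file) and that some code parses it as a plain integer. The class DurationParseDemo and both of its methods are hypothetical names used only for illustration; they are not part of Hadoop, Hive, or HCatalog.

```java
// Hypothetical demo class; not part of Hadoop, Hive, or HCatalog.
final class DurationParseDemo {

    // Returns true when the text cannot be parsed as a plain integer.
    static boolean failsAsPlainInt(String text) {
        try {
            Integer.parseInt(text);
            return false;
        } catch (NumberFormatException e) {
            // Integer.parseInt rejects any non-digit suffix such as "s".
            return true;
        }
    }

    // Parses a value such as "5s" or "5" into seconds.
    // Only a single trailing "s" is stripped; other units are not handled in this sketch.
    static long parseSeconds(String text) {
        String trimmed = text.trim();
        if (trimmed.endsWith("s")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        return Long.parseLong(trimmed);
    }

    public static void main(String[] args) {
        System.out.println(failsAsPlainInt("5s")); // plain integer parsing fails
        System.out.println(parseSeconds("5s"));    // unit-aware parsing succeeds
    }
}
```

If that assumption holds, it may be worth searching the configuration files on the classpath for a property whose value is literally "5s", and checking whether the library versions that read it all agree on the expected format.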