I'm trying to write a Hadoop mapper class in Scala. As a starting point, I took a Java example from the book "Hadoop: The Definitive Guide" and tried to port it to Scala.
The original Java class extends org.apache.hadoop.mapreduce.Mapper:
public class MaxTemperatureMapper
extends Mapper<LongWritable, Text, Text, IntWritable>
and overrides the method
public void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException
This method is invoked and works correctly (I verified it with a unit test, and then by running it with YARN).
My attempt at a Scala port is:
class MaxTemperatureMapperS extends Mapper[LongWritable, Text, Text, IntWritable]
followed by the method
@throws(classOf[IOException])
@throws(classOf[InterruptedException])
override def map(key: LongWritable, value: Text, context: Context): Unit = {
  ...
}
But the Scala compiler reports an error:
error: method map overrides nothing.
So I would have thought the two methods have the same signature in Scala and Java, but apparently I'm missing something. Can you give me a hint?
Answer 0 (score: 5)
Sometimes the best approach is to let the IDE do the work for you:
class Test extends Mapper[LongWritable, Text, Text, IntWritable] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = ???
}
In this case, the problem is that the definition of the Context class "lives" inside the Mapper class, so you need to refer to it with the type-projection (#) syntax.
Answer 1 (score: 0)
For reference, here is Scala code that overrides the map and reduce methods of the Mapper and Reducer classes, respectively.
Mapper example:
class MaxTemperatureMapper extends Mapper[LongWritable, Text, AvroKey[Integer], AvroValue[GenericRecord]] {
  val parser = new NcdcRecordParser()
  val record = new GenericData.Record(AvroSchema.SCHEMA)

  @throws(classOf[IOException])
  @throws(classOf[InterruptedException])
  override def map(key: LongWritable, value: Text,
      context: Mapper[LongWritable, Text, AvroKey[Integer], AvroValue[GenericRecord]]#Context): Unit = {
    // ... body elided in the original
  }
}
Reducer example:
class MaxTemperatureReducer extends Reducer[AvroKey[Integer], AvroValue[GenericRecord], AvroKey[GenericRecord], NullWritable] {

  @throws(classOf[IOException])
  @throws(classOf[InterruptedException])
  override def reduce(key: AvroKey[Integer], values: java.lang.Iterable[AvroValue[GenericRecord]],
      context: Reducer[AvroKey[Integer], AvroValue[GenericRecord], AvroKey[GenericRecord], NullWritable]#Context): Unit = {
    // ... body elided in the original
  }
}