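I am trying to set up the dumbo driver extension for a Hadoop streaming job that I wrote for the Python plugin mongo-hadoop.
The dumbo project requires me to use the TypedBytesWritable class, so I wrote a new InputFormat and RecordReader like this: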
package com.mongodb.hadoop;
public class TypedBytesTableInputFormat implements InputFormat<TypedBytesWritable, TypedBytesWritable> {
    @Override
    public RecordReader<TypedBytesWritable, TypedBytesWritable> getRecordReader(InputSplit split,
                                                                                JobConf job,
                                                                                Reporter reporter) {
        if (!(split instanceof MongoInputSplit))
            throw new IllegalStateException("Creation of a new RecordReader requires a MongoInputSplit instance.");
        final MongoInputSplit mis = (MongoInputSplit) split;
        //**THE FOLLOWING LINE THROWS THE ERROR**
        return (RecordReader<TypedBytesWritable, TypedBytesWritable>) new TypedBytesMongoRecordReader(mis);
    }
Here is the extended RecordReader:
package com.mongodb.hadoop.input;
...
...
import org.apache.hadoop.mapreduce.RecordReader;
...
...
public class TypedBytesMongoRecordReader extends RecordReader<TypedBytesWritable, TypedBytesWritable> {
    public TypedBytesMongoRecordReader(MongoInputSplit mis) {
        _cursor = mis.getCursor();
    }
    @Override
    public void close() {
        if ( _cursor != null )
            _cursor.close();
    }
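But when I run the job, it throws an error at the marked line. I don't understand why, since my class is a subclass of RecordReader. What am I doing wrong? Here is the API documentation for the RecordReader class; I thought I was doing everything correctly: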
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/RecordReader.html
I do get a warning on the line that casts to RecordReader, but no compile error, and the jar builds fine. The warning:
Type safety: Unchecked cast from TypedBytesMongoRecordReader to RecordReader<TypedBytesWritable,TypedBytesWritable>
Answer 0 (score: 1)
Try this:
public <T extends RecordReader<TypedBytesWritable, TypedBytesWritable>> T getRecordReader(InputSplit split, JobConf job, Reporter reporter) {
    if (!(split instanceof MongoInputSplit))
        throw new IllegalStateException("Creation of a new RecordReader requires a MongoInputSplit instance.");
    final MongoInputSplit mis = (MongoInputSplit) split;
    return new TypedBytesMongoRecordReader(mis); // you may need a cast (T) - try it without first
}
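If that does not help, one possible cause is worth checking. This is an assumption, since the question does not show the imports of TypedBytesTableInputFormat. The getRecordReader(InputSplit, JobConf, Reporter) signature matches the older org.apache.hadoop.mapred API, where RecordReader is an interface, while TypedBytesMongoRecordReader extends the abstract class org.apache.hadoop.mapreduce.RecordReader. The two types share a name but are unrelated, so the compiler accepts the cast with only an unchecked warning and the cast fails at runtime. The sketch below reproduces that behaviour with stand-in types; OldApiRecordReader, NewApiRecordReader and MyReader are hypothetical names, not Hadoop classes.

```java
// Minimal sketch with stand-in types (hypothetical names, not the real Hadoop classes).
class CastDemo {
    // Plays the role of the old-API interface org.apache.hadoop.mapred.RecordReader<K, V>.
    interface OldApiRecordReader<K, V> {
        boolean next(K key, V value);
    }

    // Plays the role of the new-API abstract class org.apache.hadoop.mapreduce.RecordReader<K, V>.
    abstract static class NewApiRecordReader<K, V> {
        abstract boolean nextKeyValue();
    }

    // Plays the role of TypedBytesMongoRecordReader: it extends only the new-API class.
    static class MyReader extends NewApiRecordReader<String, String> {
        @Override
        boolean nextKeyValue() {
            return false;
        }
    }

    // Returns true if the cast is rejected at runtime with a ClassCastException.
    @SuppressWarnings("unchecked")
    static boolean castFails() {
        try {
            // Compiles with only an unchecked warning, because a non-final class
            // could in principle have a subclass that implements the interface.
            OldApiRecordReader<String, String> old =
                    (OldApiRecordReader<String, String>) new MyReader();
            old.next(null, null); // never reached: the cast above throws first
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("cast failed at runtime: " + castFails());
    }
}
```

If this is what is happening in the question, changing the generic signature alone would not change the runtime class of the reader. The reader would need to implement the mapred RecordReader interface, or the InputFormat would need to be written against the mapreduce API.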