Processing multiple lines in a single map function

Time: 2014-09-28 14:56:13

Tags: java hadoop mapreduce

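I'm studying Hadoop, and I would like each call of the map function to operate on multiple lines. I found the property mapreduce.input.lineinputformat.linespermap, but if I understand it correctly, it sets the number of lines handed to each mapper, not to each call of the map function. How can I achieve this? Thanks in advance.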

1 answer:

Answer 0: (score: 0)

1) You have to write a custom text input format.

2) You have to create your own custom record reader for it, in which you implement the logic.

Extend the TextInputFormat class to create your own NLinesInputFormat.
Create your own RecordReader class, NLinesRecordReader, in which you implement the logic of feeding 3 lines/records at a time to the map function.
Change your driver program to use the new NLinesInputFormat class.
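
The grouping logic at the heart of such a record reader can be sketched without any Hadoop dependency. The class below is a hypothetical stand-in, not the code from the linked article: NLinesReader and its constructor are made up for illustration, the method names only mirror those of org.apache.hadoop.mapreduce.RecordReader, and the key is a line number rather than the byte offset that Hadoop's own line reader uses.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical, Hadoop-free stand-in for the NLinesRecordReader described above.
// It reads from a BufferedReader instead of an input split (an assumption made
// so that the sketch runs on a plain JDK).
class NLinesReader {
    private final BufferedReader in;
    private final int linesPerRecord;
    private long nextLineNumber = 0; // line number of the next unread line
    private long key;                // line number of the first line in the current record
    private String value;            // the grouped lines, joined with '\n'

    NLinesReader(BufferedReader in, int linesPerRecord) {
        if (linesPerRecord < 1) {
            throw new IllegalArgumentException("linesPerRecord must be >= 1");
        }
        this.in = in;
        this.linesPerRecord = linesPerRecord;
    }

    // Advances to the next record; returns false once the input is exhausted.
    boolean nextKeyValue() throws IOException {
        StringBuilder group = new StringBuilder();
        int linesRead = 0;
        String line;
        while (linesRead < linesPerRecord && (line = in.readLine()) != null) {
            if (linesRead == 0) {
                key = nextLineNumber;   // key is taken from the first line of the group
            } else {
                group.append('\n');     // separator between lines of one record
            }
            group.append(line);
            linesRead++;
            nextLineNumber++;
        }
        if (linesRead == 0) {
            return false;               // nothing left to read
        }
        value = group.toString();       // the last record may hold fewer lines
        return true;
    }

    long getCurrentKey() { return key; }

    String getCurrentValue() { return value; }

    public static void main(String[] args) throws IOException {
        NLinesReader reader = new NLinesReader(
                new BufferedReader(new StringReader("a\nb\nc\nd\ne\n")), 3);
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey() + " -> "
                    + reader.getCurrentValue().replace("\n", " | "));
        }
    }
}
```

In a real job, the same loop would presumably sit in the nextKeyValue() method of a class extending RecordReader<LongWritable, Text>, with the line reading delegated to Hadoop's LineRecordReader, and the driver would select the format with job.setInputFormatClass(NLinesInputFormat.class). Each input split gets its own record reader, so the last record of a split may contain fewer than 3 lines.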
For the complete details, follow this link: http://bigdatacircus.com/2012/08/01/wordcount-with-custom-record-reader-of-textinputformat/