MapReduce: length of an individual Avro record?

Asked: 2015-09-02 03:48:32

Tags: java avro

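I have an Avro file containing 5 records. I know you can write a MapReduce job that iterates over each record, but is there a way, inside the mapper of my Java MapReduce job, to get the "length" of each Avro record in the file, so that I can obtain: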

1) The starting position of each record as it is processed.
2) The length of each record as it exists in the file, so that I can use Java code to "seek" to the start of a specific Avro record within the file (e.g. the 4th record).

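If this is not possible with the current Avro library, that is fine.

The use case is that I would like to be able to output a file containing the following: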

<Record Number> <StartIndex> <EndIndex>
Record1 0 150
Record2 151 270
...
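
To make the desired output concrete, here is a minimal sketch of how the index lines would be derived once the per-record byte lengths are known. How to obtain those lengths from Avro inside a mapper is exactly the open question, so the lengths below (151 and 120 bytes) are hypothetical values chosen to match the example, and `RecordOffsets` and `offsetLines` are illustrative names rather than part of any library. Offsets are assumed to be 0-based with an inclusive end index, as in the example above.

```java
import java.util.ArrayList;
import java.util.List;

public class RecordOffsets {

    // Builds "<Record Number> <StartIndex> <EndIndex>" lines from record byte lengths.
    // Assumes 0-based offsets, an inclusive end index, and records stored back to back.
    public static List<String> offsetLines(long[] lengths) {
        List<String> lines = new ArrayList<>();
        long start = 0;
        for (int i = 0; i < lengths.length; i++) {
            long end = start + lengths[i] - 1;   // inclusive end of this record
            lines.add("Record" + (i + 1) + " " + start + " " + end);
            start = end + 1;                     // next record begins right after
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical lengths; in practice these would have to come from the Avro file.
        long[] lengths = {151, 120};
        for (String line : offsetLines(lengths)) {
            System.out.println(line);
        }
    }
}
```

With the hypothetical lengths above, this prints `Record1 0 150` followed by `Record2 151 270`.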

0 Answers:

No answers yet