
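I have an Avro file containing 5 records. I know you can write a MapReduce job to iterate over each record, but is there a way, inside the mapper of my Java MapReduce job, to get the "length" of each Avro record in the file, so that I can obtain: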

1) the starting position of each record as it is processed.
2) the length of the record as it exists in the file, so that I can use Java code to "seek" to the start of a specific Avro record within the file (e.g. the 4th record).

If this is not possible with the current Avro library, that is fine.

The use case is that I want to be able to output a file containing the following:

<Record Number> <StartIndex> <EndIndex>
Record1 0 150
Record2 151 270
...
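
To make the desired output concrete, here is a minimal sketch of how the index lines would be derived once per-record byte lengths are known. It assumes the lengths are already available from somewhere (which is exactly the open part of this question) and that `EndIndex` is inclusive, as in the sample above. The class `RecordIndex` and method `buildIndex` are hypothetical names used only for illustration; nothing here calls the Avro library.

```java
import java.util.ArrayList;
import java.util.List;

public class RecordIndex {

    // Builds "<Record Number> <StartIndex> <EndIndex>" lines from per-record
    // byte lengths. Assumption: records are laid out back to back and the
    // end index is inclusive.
    static List<String> buildIndex(long[] lengths) {
        List<String> lines = new ArrayList<>();
        long start = 0;
        for (int i = 0; i < lengths.length; i++) {
            long end = start + lengths[i] - 1; // last byte of this record
            lines.add("Record" + (i + 1) + " " + start + " " + end);
            start = end + 1;                   // next record starts right after
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical lengths chosen to match the sample output above.
        long[] lengths = {151, 120};
        for (String line : buildIndex(lengths)) {
            System.out.println(line);
        }
    }
}
```

With the two hypothetical lengths above, this prints `Record1 0 150` and `Record2 151 270`.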

MapReduce: length of individual Avro records?

0 answers: