Writing an Avro data file

Date: 2011-04-04 23:29:43

Tags: file avro eofexception

The following code simply writes data out in Avro format and then reads the same content back from the Avro file it wrote and displays it. I am just trying the example from Hadoop: The Definitive Guide. It worked the first time I ran it, but after that I get the error below. Since it really did work the first time, I am not sure what I am doing wrong.

Here is the exception:

Exception in thread "main" java.io.EOFException: No content to map to Object due to end of input
    at org.codehaus.jackson.map.ObjectMapper._initForReading(ObjectMapper.java:2173)
    at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2106)
    at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1065)
    at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1040)
    at org.apache.avro.Schema.parse(Schema.java:895)
    at org.avro.example.SimpleAvro.AvroExample.avrocreate(AvroDataExample.java:23)
    at org.avro.example.SimpleAvro.AvroDataExample.main(AvroDataExample.java:55)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Here is the code:

package org.avro.example.SimpleAvro;

import java.io.File;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;

class AvroExample{

    AvroExample(){

    }
    void avrocreate() throws Exception{

        Schema schema=Schema.parse(getClass().getResourceAsStream("Pair.avsc"));

        GenericRecord datum=new GenericData.Record(schema);
        datum.put("left", "L");
        datum.put("right", "R");

        File file=new File("data.avro");
        DatumWriter<GenericRecord> writer=new GenericDatumWriter<GenericRecord>(schema);
        DataFileWriter<GenericRecord> dataFileWriter=new DataFileWriter<GenericRecord>(writer);
        dataFileWriter.create(schema, file);
        dataFileWriter.append(datum);
        dataFileWriter.close();

        System.out.println("Written to avro data file");
        //reading from the avro data file

        DatumReader<GenericRecord> reader= new GenericDatumReader<GenericRecord>();
        DataFileReader<GenericRecord> dataFileReader=new DataFileReader<GenericRecord>(file,reader);
        GenericRecord result=dataFileReader.next();
        System.out.println("data" + result.get("left").toString());

        result=dataFileReader.next();
        System.out.println("data :" + result.get("left").toString());


    }

}
public class AvroDataExample {
    public static void main(String args[])throws Exception{

        AvroExample a=new AvroExample();
        a.avrocreate();
    }



}

Here is the Pair.avsc file [given with the book's example code]:

{
  "type": "record",
  "name": "Pair",
  "doc": "A pair of strings.",
  "fields": [
    {"name": "left", "type": "string"},
    {"name": "right", "type": "string"}
  ]
}

4 Answers:

Answer 0 (score: 3)

You are probably not reading the schema file correctly. I suspect this is the problem, because the stack trace shows the failure happening while it parses the schema:

Exception in thread "main" java.io.EOFException: No content to map to Object due to end of input
    at org.codehaus.jackson.map.ObjectMapper._initForReading(ObjectMapper.java:2173)
    at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2106)
    at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1065)
    at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1040)
    at org.apache.avro.Schema.parse(Schema.java:895)

Reading a file from "resources" is fraught with problems unless your environment is set up just right. Also, since you mention it worked once before, you probably just changed some environment setting (such as the working directory) for the second run.

Try copy-pasting the schema string into a String variable and parsing it directly, instead of going through the resource loader:

String schemaJson = "paste schema here (and fix quotes)";
Schema schema = Schema.parse(schemaJson);
GenericRecord datum = new GenericData.Record(schema);
...
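
For example, the Pair.avsc contents from the question can be embedded directly in the source, so no resource lookup happens at runtime. This is only a minimal sketch; it assumes the same Avro version the book uses (where the static Schema.parse(String) method is available), and the class name InlineSchemaExample is made up for illustration:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class InlineSchemaExample {
    public static void main(String[] args) {
        // The Pair schema from the question, embedded as a Java string
        // so that no resource loading is needed.
        String schemaJson =
              "{"
            + " \"type\": \"record\","
            + " \"name\": \"Pair\","
            + " \"doc\": \"A pair of strings.\","
            + " \"fields\": ["
            + "   {\"name\": \"left\", \"type\": \"string\"},"
            + "   {\"name\": \"right\", \"type\": \"string\"}"
            + " ]"
            + "}";

        Schema schema = Schema.parse(schemaJson);

        // Build one record against the parsed schema, as in the question.
        GenericRecord datum = new GenericData.Record(schema);
        datum.put("left", "L");
        datum.put("right", "R");
        System.out.println("Parsed schema: " + schema.getName());
    }
}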

Answer 1 (score: 1)

    GenericRecord result=dataFileReader.next();
    System.out.println("data" + result.get("left").toString());
    result=dataFileReader.next();
    System.out.println("data :" + result.get("left").toString());

I think this is where you went wrong.

You should read the record's "left" attribute and then its "right" attribute, as in the sketch after this answer.

Give it a try.

It works for me.
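
A minimal sketch of the reading section this answer appears to suggest, written as a drop-in replacement for the second half of avrocreate() from the question (it reuses the file variable declared there): since only one record was appended, call next() once and print both fields of that record, instead of calling next() a second time.

// Reading section only: the writer appended a single record,
// so read it once and print both of its fields.
DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
DataFileReader<GenericRecord> dataFileReader = new DataFileReader<GenericRecord>(file, reader);
GenericRecord result = dataFileReader.next();
System.out.println("left : " + result.get("left").toString());
System.out.println("right: " + result.get("right").toString());
dataFileReader.close();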

Answer 2 (score: 0)

Answer 3 (score: 0)

If the file sits at the root of the jar, put a slash in front of the file name:

Schema.parse(getClass().getResourceAsStream("/Pair.avsc"));
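
For context, and stated as general Java behaviour rather than anything Avro-specific: Class.getResourceAsStream resolves a name without a leading slash relative to the class's own package, and a name with a leading slash relative to the classpath root. With the package from the question, that means:

// In org.avro.example.SimpleAvro.AvroExample:
getClass().getResourceAsStream("Pair.avsc");   // looks for org/avro/example/SimpleAvro/Pair.avsc
getClass().getResourceAsStream("/Pair.avsc");  // looks for Pair.avsc at the root of the classpath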