Deserializing an object with a Map<String,Object> field in Avro returns values of the wrong class

Time: 2019-12-03 15:11:12

Tags: java avro

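I am trying to serialize objects that contain a Map instance with Apache Avro. The String keys of the Map are deserialized correctly, but the values come back as instances of class Object.

I could use a GenericDatumWriter with GenericData.Record instances and copy the properties into them, but I need to serialize these objects directly, without copying the Map properties into a temporary object just for serialization.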

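import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumReader;
import org.apache.avro.reflect.ReflectDatumWriter;

// create(), append() and close() throw IOException, so the method must declare it
public void test1() throws IOException {
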
    TimeDot dot = new TimeDot();
    dot.lat = 12;
    dot.lon = 34;
    dot.putProperty("id", 1234);
    dot.putProperty("s", "foo");
    System.out.println("BEFORE: " + dot);

    // serialize
    ReflectDatumWriter<TimeDot> reflectDatumWriter = new ReflectDatumWriter<>(TimeDot.class);
    Schema schema = ReflectData.get().getSchema(TimeDot.class);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataFileWriter<TimeDot> writer = new DataFileWriter<>(reflectDatumWriter).create(schema, out);
    writer.append(dot);
    writer.close();

    // deserialize
    ReflectDatumReader<TimeDot> reflectDatumReader = new ReflectDatumReader<>(TimeDot.class);
    ByteArrayInputStream inputStream = new ByteArrayInputStream(out.toByteArray());
    DataFileStream<TimeDot> reader = new DataFileStream<>(inputStream, reflectDatumReader);
    Object dot2 = reader.next();
    reader.close();
    System.out.println("AFTER: " + dot2);
}

public static class TimeDot {
    Map<String, Object> props = new LinkedHashMap<>();
    double lat;
    double lon;

    public void putProperty(String key, Object value) {
        props.put(key, value);
    }

    public String toString() {
        return "lat="+ lat +", lon="+ lon +", props="+props;
    }
}

Output:

 BEFORE: lat=12.0, lon=34.0, props={id=1234, s=foo}

 AFTER:  lat=12.0, lon=34.0, props={id=java.lang.Object@2b9627bc, s=java.lang.Object@65e2dbf3}
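
A likely reason for the `java.lang.Object@...` values, assuming Avro's reflection derives the map's value schema from the declared type argument of the field rather than from the values stored at runtime: the declaration only says `Object`, so the concrete types `Integer` and `String` are not visible to a schema generator. The sketch below uses only the JDK to show what reflection can see on such a field. `DeclaredValueTypeDemo`, `Holder` and `typeArgument` are hypothetical names for illustration, not Avro APIs.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of Avro.
final class DeclaredValueTypeDemo {

    // Same field shape as TimeDot.props in the question.
    static class Holder {
        Map<String, Object> props = new LinkedHashMap<>();
    }

    // Returns the declared type argument at the given index of a generic field.
    static Type typeArgument(Class<?> owner, String fieldName, int index) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            return type.getActualTypeArguments()[index];
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No field named " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        Holder holder = new Holder();
        holder.props.put("id", 1234);

        // What the declaration says: prints "declared value type: class java.lang.Object"
        System.out.println("declared value type: " + typeArgument(Holder.class, "props", 1));

        // What is actually stored: prints "runtime value type:  class java.lang.Integer"
        System.out.println("runtime value type:  " + holder.props.get("id").getClass());
    }
}
```

If that assumption holds, the reader has nothing better to instantiate than a plain `Object` for each value, which matches the AFTER line above.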

Next I tried to create the schema by hand, but then serialization fails:

 Exception in thread "main" java.lang.NullPointerException: in TimeDot null of java.lang.Object of map in field props of TimeDot

public void test2() throws IOException {        

    TimeDot dot = new TimeDot();
    dot.lat = 12;
    dot.lon = 34;
    dot.putProperty("id", 1234);
    dot.putProperty("s", "foo");
    System.out.println(dot);

    // create Schema
    List<Schema.Field> propFields = new ArrayList<>();
    propFields.add(new Schema.Field("id", Schema.create(Schema.Type.INT)));
    propFields.add(new Schema.Field("s", Schema.create(Schema.Type.STRING)));
    Schema propRecSchema = Schema.createRecord("Object",null,"java.lang",false,propFields);
    Schema propSchema = Schema.createMap(propRecSchema);
    List<Schema.Field> fields = new ArrayList<>(3);
    fields.add(new Schema.Field("lat", Schema.create(Schema.Type.DOUBLE)));
    fields.add(new Schema.Field("lon", Schema.create(Schema.Type.DOUBLE)));
    fields.add(new Schema.Field("props", propSchema));
    Schema schema = Schema.createRecord("TimeDot", null, "", false, fields);
    System.out.println("\nschema:\n" + schema);

    // serialize
    ReflectDatumWriter<TimeDot> reflectDatumWriter = new ReflectDatumWriter<>(TimeDot.class);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataFileWriter<TimeDot> writer = new DataFileWriter<>(reflectDatumWriter).create(schema, out);
    writer.append(dot); // *** fails here > NullPointerException ***
    writer.close();

    // deserialize
    ReflectDatumReader<TimeDot> reader = new ReflectDatumReader<>(schema);
    TimeDot dot2 = reader.read(null,
            DecoderFactory.get().binaryDecoder(out.toByteArray(), null));
    System.out.println(dot2);
}

2 Answers:

Answer 0 (score: 1)

I think the simplest approach is to add an annotation to the field:

@org.apache.avro.reflect.AvroSchema("{\"type\": \"map\", \"values\": [\"string\", \"int\"]}")
Map<String, Object> props = new LinkedHashMap<>();

Answer 1 (score: 0)

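To serialize an object that contains a Map<String, Object>, the Avro schema must declare the map values as a Union that lists every possible value type.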
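
To illustrate why every possible value type has to be listed, here is a minimal JDK-only sketch of the idea behind a union: the writer has to match each value's runtime type to one of the declared branches, and a value with no matching branch cannot be written. `UnionBranchSketch`, `BRANCHES` and `resolveBranch` are hypothetical names, and the logic is a simplification for illustration, not Avro's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these names are not Avro APIs.
final class UnionBranchSketch {

    // Java classes standing in for the union branches ["int", "string"].
    static final List<Class<?>> BRANCHES = Arrays.asList(Integer.class, String.class);

    // Returns the index of the first branch that matches the value's runtime class.
    static int resolveBranch(Object value) {
        for (int i = 0; i < BRANCHES.size(); i++) {
            if (BRANCHES.get(i).isInstance(value)) {
                return i;
            }
        }
        String typeName = (value == null) ? "null" : value.getClass().getName();
        throw new IllegalArgumentException("No union branch for " + typeName);
    }

    public static void main(String[] args) {
        System.out.println(resolveBranch(1234));  // prints 0 (the "int" branch)
        System.out.println(resolveBranch("foo")); // prints 1 (the "string" branch)
        // resolveBranch(12.5) would throw: no branch was declared for Double
    }
}
```

In this picture, adding a new kind of value to the map (for example a double) means adding a matching branch to the union first.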

Important: if the namespace is not set correctly, deserialization returns a GenericData.Record instead of a TimeDot instance.
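
One way to picture this note, assuming the reader locates the target Java class from the schema's full name (namespace plus record name): when no class with that name can be loaded, there is nothing to instantiate except a generic record. The JDK-only sketch below shows such a lookup. `FullNameLookupSketch` and `findClass` are hypothetical names, and the actual lookup rules in Avro (in particular for nested classes) may differ.

```java
// Hypothetical sketch; not Avro's actual lookup code.
final class FullNameLookupSketch {

    // Tries to load the class named namespace + "." + name.
    // Returns null when no such class exists.
    static Class<?> findClass(String namespace, String name) {
        String fullName = namespace.isEmpty() ? name : namespace + "." + name;
        try {
            return Class.forName(fullName);
        } catch (ClassNotFoundException e) {
            // A reader in this situation would have to fall back to a generic record.
            return null;
        }
    }

    public static void main(String[] args) {
        // Correct namespace: prints "class java.util.LinkedHashMap"
        System.out.println(findClass("java.util", "LinkedHashMap"));
        // Wrong namespace: prints "null"
        System.out.println(findClass("wrong.namespace", "LinkedHashMap"));
    }
}
```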

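public void test3() throws IOException {
    // map values are a union of the value types actually stored: int and string
    List<Schema.Field> fields = new ArrayList<>();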
    fields.add(new Schema.Field("lat", Schema.create(Schema.Type.DOUBLE)));
    fields.add(new Schema.Field("lon", Schema.create(Schema.Type.DOUBLE)));
    fields.add(new Schema.Field("props", Schema.createMap(
            Schema.createUnion(Arrays.asList(
                Schema.create(Schema.Type.INT),
                Schema.create(Schema.Type.STRING))))));

    Schema schema = Schema.createRecord("TimeDot", null, "TestAvroUnion", false, fields);

    TimeDot dot = new TimeDot();
    dot.lat = 12;
    dot.lon = 34;
    dot.putProperty("id", 1234);
    dot.putProperty("s", "foo");
    System.out.println("BEFORE: " + dot);

    // serialize
    ReflectDatumWriter<TimeDot> reflectDatumWriter = new ReflectDatumWriter<>(schema);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataFileWriter<TimeDot> dataWriter = new DataFileWriter<>(reflectDatumWriter);
    dataWriter.create(schema, out);
    dataWriter.append(dot);
    dataWriter.close();

    // deserialize
    ReflectDatumReader<TimeDot> reflectDatumReader = new ReflectDatumReader<>(schema);
    try(
        ByteArrayInputStream bis = new ByteArrayInputStream(out.toByteArray());
        DataFileStream<TimeDot> reader = new DataFileStream<>(bis, reflectDatumReader)
    ) {
        TimeDot dot2 = reader.next();
        System.out.println("AFTER:  " + dot2);
    }
}

The output is:

 BEFORE: lat=12.0, lon=34.0, props={id=1234, s=foo}
 AFTER:  lat=12.0, lon=34.0, props={id=1234, s=foo}

Alternatively, build the schema with SchemaBuilder:

 Schema schema = SchemaBuilder
            .record("TimeDot")
            .namespace("TestUnion")
            .fields()
            .name("lat")
                .type().doubleType()
                .noDefault()
            .name("lon")
                .type().doubleType()
                .noDefault()
            .name("props")
                .type().map()
                    .values(SchemaBuilder.unionOf().intType().and().stringType().endUnion())
                .noDefault()
            .endRecord();