Avro schema does not honor backward compatibility

Time: 2016-01-12 00:26:50

Tags: java serialization avro

I have this Avro schema:

{
 "namespace": "xx.xxxx.xxxxx.xxxxx",
 "type": "record",
 "name": "MyPayLoad",
 "fields": [
     {"name": "filed1",  "type": "string"},
     {"name": "filed2",     "type": "long"},
     {"name": "filed3",  "type": "boolean"},
     {
          "name" : "metrics",
          "type": 
          {
             "type" : "array", 
             "items": 
             { 
                 "name": "MyRecord", 
                 "type": "record", 
                 "fields" : 
                     [                         
                       {"name": "min", "type": "long"}, 
                       {"name": "max", "type": "long"}, 
                       {"name": "sum", "type": "long"}, 
                       {"name": "count", "type": "long"}
                     ]
             } 
          }
     }
  ]
}

Below is the code we use to parse the data:

public static final MyPayLoad parseBinaryPayload(byte[] payload) {
        DatumReader<MyPayLoad> payloadReader = new SpecificDatumReader<>(MyPayLoad.class);
        Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
        MyPayLoad myPayLoad = null;
        try {
            myPayLoad = payloadReader.read(null, decoder);
        } catch (IOException e) {
            logger.log(Level.SEVERE, e.getMessage(), e);
        }

        return myPayLoad;
    }

Now I want to add one more field to the schema, so that it looks like this:

 {
 "namespace": "xx.xxxx.xxxxx.xxxxx",
 "type": "record",
 "name": "MyPayLoad",
 "fields": [
     {"name": "filed1",  "type": "string"},
     {"name": "filed2",     "type": "long"},
     {"name": "filed3",  "type": "boolean"},
     {
          "name" : "metrics",
          "type": 
          {
             "type" : "array", 
             "items": 
             { 
                 "name": "MyRecord", 
                 "type": "record", 
                 "fields" : 
                     [                         
                       {"name": "min", "type": "long"}, 
                       {"name": "max", "type": "long"}, 
                       {"name": "sum", "type": "long"}, 
                       {"name": "count", "type": "long"}
                     ]
             } 
          }
     }
     {"name": "agentType",  "type": ["null", "string"], "default": "APP_AGENT"}
  ]
}

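Note the added field, which also has a default value defined. The problem is that when we receive data that was written with the old schema, I get this error: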

java.io.EOFException: null
    at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139) ~[avro-1.7.4.jar:1.7.4]
    at com.appdynamics.blitz.shared.util.XXXXXXXXXXXXX.parseBinaryPayload(BlitzAvroSharedUtil.java:38) ~[blitz-shared.jar:na]

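From this document I understood that this change should be backward compatible, but that does not seem to be the case. Any idea what I am doing wrong?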

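3 Answers:

Answer 0: (score: 2)

I finally got this working. I needed to give SpecificDatumReader both schemas: it has a constructor that takes the writer's schema and the reader's schema. So I changed the parsing like this, passing both the old and the new schema to the reader, and it worked like a charm: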

{{1}}

Answer 1: (score: 0)

I can see two possible problems in your schema:

  1. The default value apparently always has to be null here, because the default of a union field must match the union's first branch, which is "null" in your schema. To specify it, set

     "default": null

  2. Also, in your schema you forgot the comma (the field separator) between the array field and the new field. So try changing the schema to

     {
      "namespace": "xx.xxxx.xxxxx.xxxxx",
      "type": "record",
      "name": "MyPayLoad",
      "fields": [
          {"name": "filed1", "type": "string"},
          {"name": "filed2", "type": "long"},
          {"name": "filed3", "type": "boolean"},
          {
               "name" : "metrics",
               "type":
               {
                  "type" : "array",
                  "items":
                  {
                      "name": "MyRecord",
                      "type": "record",
                      "fields" :
                          [
                            {"name": "min", "type": "long"},
                            {"name": "max", "type": "long"},
                            {"name": "sum", "type": "long"},
                            {"name": "count", "type": "long"}
                          ]
                  }
               }
          },
          {"name": "agentType", "type": ["null", "string"], "default": null}
       ]
     }

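Answer 2: (score: 0)

I am facing the same situation. Data written with the old schema fails when I try to read it with the newer schema. The newer schema only has one additional field, a union with a default set: "type": ["null", "string"], "doc": "", "default": null

Even though the default is set, the null value is not filled in automatically during the read. Both the writer schema and the reader schema have to be provided when reading. My understanding was that Avro is backward compatible and should be able to handle newer columns without needing the old schema.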
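
One way to see why the reader needs the writer's schema: Avro's binary encoding carries no field names or tags, only the field values back to back in schema order. A decoder that is given only the new schema therefore expects a union index for agentType after the metrics array, finds no bytes left, and fails with an EOFException like the one in the question. The following is a minimal sketch of that effect in plain Java. It does not use the Avro library; the class AvroWireSketch and all of its methods are hypothetical names made up for this illustration, and the metrics array is assumed to be empty to keep the code short.

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;

    // Hypothetical illustration class; it hand-rolls the wire format and does not use Avro.
    public class AvroWireSketch {

        // Avro int/long: zig-zag encode, then emit 7-bit groups, lowest group first.
        static void writeLong(ByteArrayOutputStream out, long n) {
            long z = (n << 1) ^ (n >> 63);
            while ((z & ~0x7FL) != 0) {
                out.write((int) ((z & 0x7F) | 0x80));
                z >>>= 7;
            }
            out.write((int) z);
        }

        static long readLong(InputStream in) throws IOException {
            long z = 0;
            for (int shift = 0; ; shift += 7) {
                int b = in.read();
                if (b < 0) throw new EOFException("no bytes left");
                z |= (long) (b & 0x7F) << shift;
                if ((b & 0x80) == 0) break;
            }
            return (z >>> 1) ^ -(z & 1);
        }

        // Old-schema record: filed1, filed2, filed3, metrics. There is no agentType on the wire.
        static byte[] writeOldPayload(String filed1, long filed2, boolean filed3) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] utf8 = filed1.getBytes(StandardCharsets.UTF_8);
            writeLong(out, utf8.length);     // string: byte length, then UTF-8 bytes
            out.write(utf8, 0, utf8.length);
            writeLong(out, filed2);
            out.write(filed3 ? 1 : 0);       // boolean: a single byte
            writeLong(out, 0);               // array: a block count of 0 ends the array
            return out.toByteArray();
        }

        // Reads the four original fields; with expectAgentType it also reads a union index,
        // which is what a reader that only knows the new schema would do.
        static String readPayload(byte[] payload, boolean expectAgentType) throws IOException {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
            byte[] utf8 = new byte[(int) readLong(in)];
            in.readFully(utf8);
            String filed1 = new String(utf8, StandardCharsets.UTF_8);
            long filed2 = readLong(in);
            boolean filed3 = in.readBoolean();
            if (readLong(in) != 0) {
                throw new IOException("sketch only handles an empty metrics array");
            }
            String result = filed1 + "/" + filed2 + "/" + filed3;
            if (expectAgentType) {
                long branch = readLong(in);  // old data ends here, so this throws EOFException
                result += "/branch" + branch;
            }
            return result;
        }

        // Convenience wrapper that reports the failure as text instead of throwing.
        static String tryRead(byte[] payload, boolean expectAgentType) {
            try {
                return readPayload(payload, expectAgentType);
            } catch (EOFException e) {
                return "EOFException";
            } catch (IOException e) {
                return "IOException: " + e.getMessage();
            }
        }

        public static void main(String[] args) {
            byte[] old = writeOldPayload("a", 42L, true);
            System.out.println(tryRead(old, false)); // prints a/42/true
            System.out.println(tryRead(old, true));  // prints EOFException
        }
    }
    ```

When the real Avro reader is given the writer's schema as well, schema resolution tells it that agentType is absent from the data, so it can apply the default instead of trying to decode bytes that were never written.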