I have this Avro schema:
{
"namespace": "xx.xxxx.xxxxx.xxxxx",
"type": "record",
"name": "MyPayLoad",
"fields": [
{"name": "filed1", "type": "string"},
{"name": "filed2", "type": "long"},
{"name": "filed3", "type": "boolean"},
{
"name" : "metrics",
"type":
{
"type" : "array",
"items":
{
"name": "MyRecord",
"type": "record",
"fields" :
[
{"name": "min", "type": "long"},
{"name": "max", "type": "long"},
{"name": "sum", "type": "long"},
{"name": "count", "type": "long"}
]
}
}
}
]
}
Here is the code we use to parse the data:
public static final MyPayLoad parseBinaryPayload(byte[] payload) {
    DatumReader<MyPayLoad> payloadReader = new SpecificDatumReader<>(MyPayLoad.class);
    Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
    MyPayLoad myPayLoad = null;
    try {
        myPayLoad = payloadReader.read(null, decoder);
    } catch (IOException e) {
        logger.log(Level.SEVERE, e.getMessage(), e);
    }
    return myPayLoad;
}
Now I want to add a field to the schema, so that it looks like this:
{
"namespace": "xx.xxxx.xxxxx.xxxxx",
"type": "record",
"name": "MyPayLoad",
"fields": [
{"name": "filed1", "type": "string"},
{"name": "filed2", "type": "long"},
{"name": "filed3", "type": "boolean"},
{
"name" : "metrics",
"type":
{
"type" : "array",
"items":
{
"name": "MyRecord",
"type": "record",
"fields" :
[
{"name": "min", "type": "long"},
{"name": "max", "type": "long"},
{"name": "sum", "type": "long"},
{"name": "count", "type": "long"}
]
}
}
}
{"name": "agentType", "type": ["null", "string"], "default": "APP_AGENT"}
]
}
Note the added field, which also defines a default value. The problem is that when we receive data written with the old schema, I get this error:
java.io.EOFException: null
at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148) ~[avro-1.7.4.jar:1.7.4]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139) ~[avro-1.7.4.jar:1.7.4]
at com.appdynamics.blitz.shared.util.XXXXXXXXXXXXX.parseBinaryPayload(BlitzAvroSharedUtil.java:38) ~[blitz-shared.jar:na]
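From this document I understood that this should be backward compatible, but that does not seem to be the case. Any idea what I am doing wrong?
Answer 0: (score: 2)
I finally got this working. I needed to give SpecificDatumReader both schemas, so I changed the parsing to pass both the old and the new schema to the reader, and it worked like a charm: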
{{1}}
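A minimal sketch of the idea, not the answerer's actual code. It uses GenericDatumReader so that it runs without the generated MyPayLoad class; SpecificDatumReader offers the same (writer schema, reader schema) constructor. The class name TwoSchemaRead and the cut-down OLD and NEW schemas are hypothetical stand-ins for the schemas in the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

final class TwoSchemaRead {
    // Hypothetical cut-down version of the old schema: one field only.
    static final Schema OLD = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"MyPayLoad\",\"fields\":["
        + "{\"name\":\"filed1\",\"type\":\"string\"}]}");

    // Same record plus the new optional field, defaulting to null.
    static final Schema NEW = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"MyPayLoad\",\"fields\":["
        + "{\"name\":\"filed1\",\"type\":\"string\"},"
        + "{\"name\":\"agentType\",\"type\":[\"null\",\"string\"],\"default\":null}]}");

    // Produces bytes the way an old producer would: encoded with OLD only.
    static byte[] writeWithOldSchema(String filed1) {
        try {
            GenericRecord record = new GenericData.Record(OLD);
            record.put("filed1", filed1);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            new GenericDatumWriter<GenericRecord>(OLD).write(record, encoder);
            encoder.flush();
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads old bytes into the new shape: writer schema first, reader schema second.
    static GenericRecord readResolved(byte[] payload) {
        try {
            GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(OLD, NEW);
            Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
            return reader.read(null, decoder);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling TwoSchemaRead.readResolved(TwoSchemaRead.writeWithOldSchema("abc")) should return a record whose filed1 is "abc" and whose agentType is filled from the default. In practice this means the reader has to know which schema each payload was written with.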
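Answer 1: (score: 0)
I can see two possible problems in your schema. A comma is missing between the metrics field and the agentType field, and the default for agentType does not match its type: for a union, the default must correspond to the first branch, so with ["null", "string"] it has to be
"default": null
The corrected schema: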
{
"namespace": "xx.xxxx.xxxxx.xxxxx",
"type": "record",
"name": "MyPayLoad",
"fields": [
{"name": "filed1", "type": "string"},
{"name": "filed2", "type": "long"},
{"name": "filed3", "type": "boolean"},
{
"name" : "metrics",
"type":
{
"type" : "array",
"items":
{
"name": "MyRecord",
"type": "record",
"fields" :
[
{"name": "min", "type": "long"},
{"name": "max", "type": "long"},
{"name": "sum", "type": "long"},
{"name": "count", "type": "long"}
]
}
}
},
{"name": "agentType", "type": ["null", "string"], "default":null}
]
}
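Answer 2: (score: 0)
I am facing the same situation. Data written with the old schema fails when it is read with the newer schema. The newer schema only has one additional field, a union with a default set: "type": ["null", "string"], "doc": "", "default": null
Even though the default is set, the null value is not filled in automatically during the read. Both the writer and the reader schema need to be provided when reading. My understanding was that Avro is backward compatible and should be able to handle newly added columns without needing the old schema.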
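One way to see why the writer schema is needed: Avro's binary encoding carries no field names or tags, so field values are simply concatenated in schema order. The sketch below is plain Java with no Avro dependency. It hand-rolls the zig-zag varint used for long values and a cut-down two-field layout; the class and method names are hypothetical. Decoding old bytes as if they had the new layout runs off the end of the buffer while looking for the union branch index of agentType, which is consistent with the readIndex frames in the stack trace in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.nio.charset.StandardCharsets;

final class UntaggedEncodingDemo {
    // Writes a long as a zig-zag varint, the encoding Avro uses for int and long.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Reads a zig-zag varint; throws EOFException when the input is exhausted.
    static long readLong(ByteArrayInputStream in) throws EOFException {
        long z = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("no more bytes");
            }
            z |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return (z >>> 1) ^ -(z & 1);
    }

    // Old layout: filed1 (string = length + UTF-8 bytes), then filed2 (long).
    static byte[] encodeOld(String filed1, long filed2) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = filed1.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        writeLong(out, filed2);
        return out.toByteArray();
    }

    // Decodes as if the bytes had the new layout: filed1, filed2, then a
    // union branch index for agentType. Returns true if that read hits EOF.
    static boolean readAsNewLayoutHitsEof(byte[] payload) {
        ByteArrayInputStream in = new ByteArrayInputStream(payload);
        try {
            int length = (int) readLong(in);
            in.skip(length);   // filed1 bytes
            readLong(in);      // filed2
            readLong(in);      // union index: absent from old data
            return false;
        } catch (EOFException e) {
            return true;
        }
    }
}
```

With only the new schema in hand, a decoder has no way to tell that the bytes stop before agentType, so the default never comes into play. Defaults are applied during schema resolution, which requires knowing the writer schema as well.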