Question

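I have renamed a field in a record that is serialized with Avro. I want to support reading older versions of the data without a schema registry, so I keep every version of the schema as a resource loaded from the classpath.

This works well and supports schema evolution: I can read data serialized with an old schema as long as the schemas are backward compatible. To make sure of that, I want to validate the schemas when the application starts. Unfortunately, schema validation does not support field aliases, even though decoding data does.

Here is a simple example that demonstrates the problem: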

import java.util.Collections;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.SchemaValidationException;
import org.apache.avro.SchemaValidatorBuilder;


public class Bar {
    public static void main(String[] args) throws SchemaValidationException {
        Schema stringType = SchemaBuilder.builder().stringType();
        Schema s1 = SchemaBuilder.builder().record("foo").fields()
                .name("test1").type(stringType).noDefault()
                .endRecord();
        Schema s2 = SchemaBuilder.builder().record("foo").fields()
                .name("test2").aliases("test1").type(stringType).noDefault()
                .endRecord();

        new SchemaValidatorBuilder().canReadStrategy().validateLatest().validate(s2, Collections.singleton(s1));

    }
}

This throws the following exception:

Exception in thread "main" org.apache.avro.SchemaValidationException: Unable to read schema: 
{
  "type" : "record",
  "name" : "foo",
  "fields" : [ {
    "name" : "test1",
    "type" : "string"
  } ]
}
using schema:
{
  "type" : "record",
  "name" : "foo",
  "fields" : [ {
    "name" : "test2",
    "type" : "string",
    "aliases" : [ "test1" ]
  } ]
}
    at org.apache.avro.ValidateMutualRead.canRead(ValidateMutualRead.java:70)
    at org.apache.avro.ValidateCanRead.validate(ValidateCanRead.java:40)
    at org.apache.avro.ValidateLatest.validate(ValidateLatest.java:51)
    at Bar.main(Bar.java:18)

Answer 1

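Sorry for answering my own question:

I found a variation of this question on the Avro user mailing list, but it went unanswered: "Different behavior between SchemaValidator and SchemaCompatibility regarding aliased field".

It looks to me like SchemaValidator has a bug, but I don't understand why both SchemaValidator and SchemaCompatibility exist, so it feels like I'm missing something.

In short, using SchemaCompatibility.checkReaderWriterCompatibility instead of SchemaValidatorBuilder works. It appears to be more complete and to reuse the decoding logic.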
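
For illustration, here is a minimal sketch of that replacement, using the same two schemas as in the question. The class name AliasCompatibilityCheck and the helper canRead are made up for this example; only SchemaCompatibility.checkReaderWriterCompatibility and the types it returns come from Avro. The sketch assumes that a result of COMPATIBLE is the only outcome you want to accept at startup.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.SchemaCompatibility.SchemaCompatibilityType;
import org.apache.avro.SchemaCompatibility.SchemaPairCompatibility;

public class AliasCompatibilityCheck {

    // Hypothetical helper: true if data written with `writer` can be read with `reader`.
    static boolean canRead(Schema reader, Schema writer) {
        SchemaPairCompatibility result =
                SchemaCompatibility.checkReaderWriterCompatibility(reader, writer);
        return result.getType() == SchemaCompatibilityType.COMPATIBLE;
    }

    public static void main(String[] args) {
        Schema stringType = SchemaBuilder.builder().stringType();

        // Old (writer) schema: the field is still called "test1".
        Schema s1 = SchemaBuilder.builder().record("foo").fields()
                .name("test1").type(stringType).noDefault()
                .endRecord();

        // New (reader) schema: the field is renamed to "test2" and keeps "test1" as an alias.
        Schema s2 = SchemaBuilder.builder().record("foo").fields()
                .name("test2").aliases("test1").type(stringType).noDefault()
                .endRecord();

        // Fail fast at startup if the latest schema cannot read the old one.
        if (!canRead(s2, s1)) {
            throw new IllegalStateException("Schema " + s2.getFullName()
                    + " cannot read data written with the previous version");
        }
        System.out.println("Latest schema can read the previous version");
    }
}
```

To check several old versions, the same helper can be called in a loop over every schema loaded from the classpath, with the latest schema as the reader each time.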
