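I'm using Avro's Java API from Scala and would like to know whether there is a simple programmatic way to add a field to an existing record schema using the Avro GenericRecord / SchemaBuilder API.
Answer 0 (score: 3)
There is no simple way, but I know exactly what you are trying to do.
Here is an example that dynamically extends an existing schema (built here with SchemaBuilder).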
import java.util.ArrayList;
import java.util.List;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

Schema schema = SchemaBuilder
    .record("schema_base").namespace("com.namespace.test")
    .fields()
    .name("longField").type().longType().noDefault()
    .name("stringField").type().stringType().noDefault()
    .name("booleanField").type().booleanType().noDefault()
    .name("optionalStringColumn").type().optional().stringType()
    .endRecord();
List<Schema.Field> field_list = schema.getFields();
ArrayList<Schema.Field> new_list = new ArrayList<>();
// Create a new, "empty" record schema (no fields set yet).
// Signature: Schema.createRecord(String name, String doc, String namespace, boolean isError)
Schema s2 = Schema.createRecord("new_schema", "info", "com.namespace.test", false);
// Copy the existing fields into fresh Field objects
for (Schema.Field f : field_list) {
    // f.schema() is the field's own type schema (e.g. long or string), not a link back to the enclosing record schema
    Schema.Field ff = new Schema.Field(f.name(), f.schema(), f.doc(), f.defaultVal());
    new_list.add(ff);
}
// This just shows how to create an optional string: it is a union of the null and string types
ArrayList<Schema> optionalString = new ArrayList<>();
optionalString.add(Schema.create(Schema.Type.NULL));
optionalString.add(Schema.create(Schema.Type.STRING));
// Add the three new test fields as optional string types.
// The default is a real JSON null (not the string "null") so that it matches the union's first branch.
// When you write a record, a non-optional field that was never set does not pick up its default value.
String[] sArray = {"test", "test2", "test3"};
for (String s : sArray) {
    Schema.Field f = new Schema.Field(s, Schema.createUnion(optionalString), s,
        org.apache.avro.JsonProperties.NULL_VALUE);
    new_list.add(f);
}
s2.setFields(new_list);
You can't just set fields on the existing schema, because a schema is locked once its fields have been set.
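To see the lock in action, here is a minimal sketch. It assumes the Avro library is on the classpath; the class and method names (`LockedSchemaDemo`, `secondSetFieldsIsRejected`) are made up for illustration. It calls `setFields` on a record schema whose fields are already set and reports whether Avro refuses.

```java
import java.util.ArrayList;

import org.apache.avro.AvroRuntimeException;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

// Hypothetical demo class, not part of Avro.
final class LockedSchemaDemo {

    /** Returns true if setFields is rejected on a schema whose fields are already set. */
    static boolean secondSetFieldsIsRejected() {
        // endRecord() has already set the field list on this schema.
        Schema built = SchemaBuilder
            .record("schema_base").namespace("com.namespace.test")
            .fields()
            .name("longField").type().longType().noDefault()
            .endRecord();
        try {
            built.setFields(new ArrayList<>());
            return false;
        } catch (AvroRuntimeException e) {
            // Avro refuses to replace the fields of a record schema that already has them.
            return true;
        }
    }
}
```

This is why the example above copies the fields into a brand-new record schema instead of touching the original.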
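Note: be careful with default values. If a default does not match the field's type, everything will write fine, but you won't be able to read the Avro file back!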
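Defaults mostly come into play at read time, when data written with the old schema is read with the extended one. The following is a rough sketch of that round trip, not a definitive recipe. It assumes Avro 1.8 or later on the classpath; `DefaultRoundTrip`, `withStringField`, the field name `note` and the default `"n/a"` are all invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

// Hypothetical demo class, not part of Avro.
final class DefaultRoundTrip {

    /** Copies base's fields and appends a plain string field with the given default. */
    static Schema withStringField(Schema base, String name, String defaultValue) {
        List<Schema.Field> fields = new ArrayList<>();
        for (Schema.Field f : base.getFields()) {
            fields.add(new Schema.Field(f.name(), f.schema(), f.doc(), f.defaultVal()));
        }
        // The default is a string, which matches the field's string type.
        fields.add(new Schema.Field(name, Schema.create(Schema.Type.STRING), null, defaultValue));
        return Schema.createRecord(base.getName(), base.getDoc(), base.getNamespace(),
            base.isError(), fields);
    }

    /** Writes a record with the old schema, then reads it back with the extended schema. */
    static GenericRecord writeOldReadNew() {
        Schema oldSchema = SchemaBuilder
            .record("schema_base").namespace("com.namespace.test")
            .fields()
            .name("longField").type().longType().noDefault()
            .endRecord();
        Schema newSchema = withStringField(oldSchema, "note", "n/a");

        GenericRecord oldRecord = new GenericData.Record(oldSchema);
        oldRecord.put("longField", 42L);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            new GenericDatumWriter<GenericRecord>(oldSchema).write(oldRecord, encoder);
            encoder.flush();
            // Writer schema = old, reader schema = new: the missing field is filled from its default.
            return new GenericDatumReader<GenericRecord>(oldSchema, newSchema)
                .read(null, DecoderFactory.get().binaryDecoder(out.toByteArray(), null));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The record that comes back carries both `longField` and the new `note` field, with `note` taken from the default declared on the reader schema.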
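Answer 1 (score: 1)
Same answer, different coding style
@tmx provided a complete answer. Once a schema is created, everything is locked. The only way is to implement a copy method. Here is a more compact version: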
import java.util.List;
import java.util.stream.Collectors;
import org.apache.avro.Schema;

// Start with a base schema
Schema base = ...;
// Get a copy of the base schema's fields.
// Once a field is used in a schema, it gets a position.
// Reusing that field in another schema throws an exception,
// so we need a fresh Field for each field of the old schema.
List<Schema.Field> baseFields = base.getFields().stream()
    .map(field -> new Schema.Field(field.name(), field.schema(), field.doc(), field.defaultVal()))
    .collect(Collectors.toList());
// Add your field (newFieldSchema is the Schema of the field being added)
baseFields.add(new Schema.Field("Name", newFieldSchema));
Schema newSchema = Schema.createRecord(
    base.getName(),
    "New schema by adding a new field",
    "com.my.name.space",
    false,
    baseFields);
Once you have baseFields, you can make any change you like: add, remove, or modify fields.
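As one illustration of the "remove" case, the copy step can filter out a field by name before the new record is created. This is only a sketch under the same assumptions as above (Avro on the classpath); `SchemaEdits`, `withoutField` and `example` are hypothetical names, and unlike the snippet above it reuses the base schema's doc and namespace instead of hard-coded strings.

```java
import java.util.List;
import java.util.stream.Collectors;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

// Hypothetical helper class, not part of Avro.
final class SchemaEdits {

    /** Returns a new record schema with the same name as base, minus the named field. */
    static Schema withoutField(Schema base, String fieldName) {
        List<Schema.Field> kept = base.getFields().stream()
            .filter(f -> !f.name().equals(fieldName))
            // Fresh Field objects, because a Field cannot be reused in another schema.
            .map(f -> new Schema.Field(f.name(), f.schema(), f.doc(), f.defaultVal()))
            .collect(Collectors.toList());
        return Schema.createRecord(base.getName(), base.getDoc(), base.getNamespace(),
            base.isError(), kept);
    }

    /** Builds a three-field schema and drops "stringField" from it. */
    static Schema example() {
        Schema base = SchemaBuilder
            .record("schema_base").namespace("com.namespace.test")
            .fields()
            .name("longField").type().longType().noDefault()
            .name("stringField").type().stringType().noDefault()
            .name("booleanField").type().booleanType().noDefault()
            .endRecord();
        return withoutField(base, "stringField");
    }
}
```

The base schema is left untouched; only the returned copy lacks the field.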
Answer 2 (score: 0)
Where applicable, don't forget to copy the aliases as well:
List<Schema.Field> baseFields = base.getFields().stream()
    .map(field -> {
        Schema.Field f = new Schema.Field(field.name(), field.schema(), field.doc(), field.defaultVal());
        field.aliases().forEach(f::addAlias);
        return f;
    })
    .collect(Collectors.toList());
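If you are on Avro 1.9 or later, the `Schema.Field(Field, Schema)` copy constructor looks like a shorter route, since as far as I can tell it carries over the doc, default, order, custom properties and aliases in one go. Treat the following as a sketch under that assumption; `FieldCopies`, `copyFields` and `example` are names invented here.

```java
import java.util.List;
import java.util.stream.Collectors;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

// Hypothetical helper class, not part of Avro.
final class FieldCopies {

    /** Copies every field of base using the Field copy constructor (assumes Avro 1.9+). */
    static List<Schema.Field> copyFields(Schema base) {
        return base.getFields().stream()
            .map(f -> new Schema.Field(f, f.schema()))
            .collect(Collectors.toList());
    }

    /** Builds a schema with an aliased field and copies it into a new record schema. */
    static Schema example() {
        Schema base = SchemaBuilder
            .record("schema_base").namespace("com.namespace.test")
            .fields()
            .name("longField").aliases("oldLong").type().longType().noDefault()
            .endRecord();
        return Schema.createRecord(base.getName(), base.getDoc(), base.getNamespace(),
            base.isError(), copyFields(base));
    }
}
```

On older Avro versions the explicit `addAlias` loop shown above remains the way to go.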