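I'm new to Avro and have the following schema, id.avsc. I'm using Scala to try to populate a record. My understanding is that this is a union in Avro, but I can't work out how to populate the individual records inside the union. Any suggestions in Scala or Java are welcome: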
{
"type" : "record",
"name" : "mytest",
"namespace" : "risk",
"fields" : [ {
"name" : "id",
"type" : [ {
"type" : "record",
"name" : "myid",
"fields" : [ {
"name" : "myid1",
"type" : "string"
}, {
"name" : "multiids",
"type" : {
"type" : "map",
"values" : "string"
}
} ]
}, {
"type" : "record",
"name" : "yourid",
"fields" : [ {
"name" : "yourid2",
"type" : "string"
}, {
"name" : "multiids",
"type" : {
"type" : "map",
"values" : "string"
}
} ]
}, {
"type" : "record",
"name" : "extraid",
"fields" : [ {
"name" : "name",
"type" : "string"
} ]
} ]
} ]
}
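The code below throws a NullPointerException, I guess because these are records rather than fields (but I can't see a getRecord method). If I try to use the "id" field instead, I get an error saying it is "Not a record schema". How can I do this, or something similar, to populate the values in this nested union within the record?
*I have tried several Scala libraries for Avro, but each of them is missing something, and none offers a complete solution for my schema.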
import java.io.File
import org.apache.avro.Schema.Parser
import org.apache.avro.file.{DataFileReader, DataFileWriter}
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}
object Avro extends App {
val idSchema = scala.io.Source.fromFile("id.avsc").mkString
val avroIdSchema = new Parser().parse(idSchema)
val idMessage = new GenericData.Record(avroIdSchema)
val idGenericRecord = new GenericData.Record(avroIdSchema.getField("myid").schema())
idGenericRecord.put("myid1", "1234")
val multiIdsMap = new java.util.HashMap[String,String]
multiIdsMap.put("123" , "1234")
idGenericRecord.put("multiids",multiIdsMap)
idMessage .put("myid", idGenericRecord)
// similar implementation for the other records, but it fails on the first one
}
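Answer 0 (score: 0)
If you are using Scala with Avro, you might consider using a library to handle the boilerplate (I have used avro4s successfully). I think your code would be equivalent to: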
import com.sksamuel.avro4s.RecordFormat
case class MyId(myid1: String, multiids: Map[String, String])
case class YourId(yourid2: String, multiids: Map[String, String])
case class ExtraId(name: String)
case class MyTest(myid: MyId, yourid: YourId, extraid: ExtraId)
val record = MyTest(
MyId("test", Map("test1" -> "test2")),
...
)
val format = RecordFormat[MyTest]
// record is of type GenericRecord
val genericRecord = format.to(record)
Edit: Sorry, I hadn't noticed that you had already tried some libraries... Here is some simple example of nested record population in Java.
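For the plain generic API, here is a minimal sketch of what I believe is going wrong and one way around it. The top-level record mytest has only one field, "id", so getField("myid") returns null, which explains the NullPointerException. The schema of "id" is a union, not a record, which explains the "Not a record schema" error when it is passed to GenericData.Record. The sketch picks the myid branch out of the union, builds that record, and puts it into the "id" field. The object name UnionExample and the inlined schema string are my own assumptions for illustration; the schema content is the id.avsc from the question.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.GenericData

// Hypothetical object name, used only for this illustration.
object UnionExample {
  // Same schema as id.avsc in the question, inlined so the sketch is self-contained.
  val schemaJson: String =
    """{"type":"record","name":"mytest","namespace":"risk","fields":[{"name":"id","type":[
      |{"type":"record","name":"myid","fields":[
      |  {"name":"myid1","type":"string"},
      |  {"name":"multiids","type":{"type":"map","values":"string"}}]},
      |{"type":"record","name":"yourid","fields":[
      |  {"name":"yourid2","type":"string"},
      |  {"name":"multiids","type":{"type":"map","values":"string"}}]},
      |{"type":"record","name":"extraid","fields":[
      |  {"name":"name","type":"string"}]}
      |]}]}""".stripMargin

  val schema: Schema = new Schema.Parser().parse(schemaJson)

  // The only field of the top-level record is "id"; its schema is a union.
  val unionSchema: Schema = schema.getField("id").schema()

  // Look up the "myid" branch of the union by its full name (namespace + name).
  val myIdSchema: Schema =
    unionSchema.getTypes.get(unionSchema.getIndexNamed("risk.myid").intValue())

  // Build the nested record against the branch schema, not the union schema.
  val myId = new GenericData.Record(myIdSchema)
  myId.put("myid1", "1234")
  val multiIds = new java.util.HashMap[String, String]()
  multiIds.put("123", "1234")
  myId.put("multiids", multiIds)

  // A union field holds exactly one branch value, so put the record under "id".
  val message = new GenericData.Record(schema)
  message.put("id", myId)

  def main(args: Array[String]): Unit = println(message)
}
```

Note that a union field carries a single value, so one mytest record holds either a myid, a yourid, or an extraid, not all three at once. To populate one of the other branches, look it up the same way (for example with "risk.yourid") and put that record into "id" instead.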