Question

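I'm new to Avro and have the following schema, id.avsc. I'm using Scala to try to populate a record. My understanding is that this is a union in Avro, but I can't work out how to populate the individual records inside the union. Any suggestions in Scala or Java are welcome: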

{
  "type" : "record",
  "name" : "mytest",
  "namespace" : "risk",
  "fields" : [ {
    "name" : "id",
    "type" : [ {
      "type" : "record",
      "name" : "myid",
      "fields" : [ {
        "name" : "myid1",
        "type" : "string"
      }, {
        "name" : "multiids",
        "type" : {
          "type" : "map",
          "values" : "string"
        }
      } ]
    }, {
      "type" : "record",
      "name" : "yourid",
      "fields" : [ {
        "name" : "yourid2",
        "type" : "string"
      }, {
        "name" : "multiids",
        "type" : {
          "type" : "map",
          "values" : "string"
        }
      } ]
    }, {
      "type" : "record",
      "name" : "extraid",
      "fields" : [ {
        "name" : "name",
        "type" : "string"
      } ]
    } ]
  } ]
}

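The code below throws a NullPointerException, I guess because these are records rather than fields (but I can't see a getRecord method). If I try to pass in the "id" field instead, I get an error saying it is "not a record schema". How can I do this, or something like it, to populate the values of the records in this nested union?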

*I have tried several Scala libraries for Avro, but each of them is missing something; none offers a complete solution that works for my schema.

import java.io.File

import org.apache.avro.Schema.Parser
import org.apache.avro.file.{DataFileReader, DataFileWriter}
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}


object Avro extends App  {
  val idSchema = scala.io.Source.fromFile("id.avsc").mkString
  val avroIdSchema = new Parser().parse(idSchema)
  val idMessage = new GenericData.Record(avroIdSchema)

  val idGenericRecord = new GenericData.Record(avroIdSchema.getField("myid").schema())
  idGenericRecord.put("myid1", "1234")
  val multiIdsMap = new java.util.HashMap[String,String]
  multiIdsMap.put("123" , "1234")
  idGenericRecord.put("multiids",multiIdsMap)
  idMessage.put("myid", idGenericRecord)
  //similar implementation for the other records, but it fails on the first one
}

Answer 1

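If you are using Scala with Avro, you might consider using a library to handle the boilerplate (I have used avro4s successfully). I think your code would then be roughly equivalent to: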

import com.sksamuel.avro4s.RecordFormat

case class MyId(myid1: String, multiids: Map[String, String])
case class YourId(yourid2: String, multiids: Map[String, String])
case class ExtraId(name: String)
case class MyTest(myid: MyId, yourid: YourId, extraid: ExtraId)

val record = MyTest(
  MyId("test", Map("test1" -> "test2")),
  ...
)

val format = RecordFormat[MyTest]
// record is of type GenericRecord
val genericRecord = format.to(record)

Edit: sorry, I hadn't noticed that you have already tried some libraries... Here is a simple example of nested record population in Java.
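For the plain GenericRecord API, a minimal sketch of one way to populate the union might look like the following. It assumes the schema from the question (inlined here so the snippet is self-contained), and the names PopulateUnion, branch, and buildMessage are made up for illustration. The idea is that the top-level record has a single field called "id", so getField("myid") returns null (hence the NullPointerException), and the schema of "id" is a union, which is why it cannot be handed to GenericData.Record directly. Instead, pick the record schema you want out of the union's getTypes list, build that record, and put it into the "id" field:

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import scala.collection.JavaConverters._

object PopulateUnion {
  // Inline copy of id.avsc from the question (assumption: same content as the file).
  val schemaJson: String =
    """{"type":"record","name":"mytest","namespace":"risk","fields":[
      |  {"name":"id","type":[
      |    {"type":"record","name":"myid","fields":[
      |      {"name":"myid1","type":"string"},
      |      {"name":"multiids","type":{"type":"map","values":"string"}}]},
      |    {"type":"record","name":"yourid","fields":[
      |      {"name":"yourid2","type":"string"},
      |      {"name":"multiids","type":{"type":"map","values":"string"}}]},
      |    {"type":"record","name":"extraid","fields":[
      |      {"name":"name","type":"string"}]}
      |  ]}
      |]}""".stripMargin

  val topSchema: Schema = new Schema.Parser().parse(schemaJson)

  // "id" is the only field of mytest; its schema is a UNION of three record schemas.
  val idUnion: Schema = topSchema.getField("id").schema()

  // Hypothetical helper: find one branch of the union by its record name.
  def branch(name: String): Schema =
    idUnion.getTypes.asScala
      .find(_.getName == name)
      .getOrElse(sys.error(s"no union branch named $name"))

  def buildMessage(): GenericRecord = {
    val multiIds = new java.util.HashMap[String, String]()
    multiIds.put("123", "1234")

    // Build the nested record against the chosen branch schema, not the union.
    val myId = new GenericData.Record(branch("myid"))
    myId.put("myid1", "1234")
    myId.put("multiids", multiIds)

    // A union field holds exactly one of its branches at a time.
    val message = new GenericData.Record(topSchema)
    message.put("id", myId)
    message
  }
}
```

Calling PopulateUnion.buildMessage() should give you a mytest record whose "id" holds the myid branch; the same pattern with branch("yourid") or branch("extraid") would cover the other two records. Note that a single mytest record can carry only one of the three, since that is what a union means in Avro.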

Populating nested records in Avro
