How do I serialize a Scala case class with Avro?

Time: 2016-08-15 12:56:41

Tags: scala avro

Follow-up to this question: Avro serialisation cast error in Scala

What is the best way to serialize a Scala case class with Avro?

Here is what I am doing at the moment:

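  import java.io.ByteArrayOutputStream
  import org.apache.avro.generic.{GenericData, GenericDatumWriter, GenericRecord}
  import org.apache.avro.io.EncoderFactory

  // avro_schema is the parsed org.apache.avro.Schema for the schema shown below
  def serializeSubmapRecord(record: MyRecord): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get.binaryEncoder(out, null)
    val writer = new GenericDatumWriter[GenericRecord](avro_schema)
    // Copy the case class fields into a generic record by hand
    val r = new GenericData.Record(avro_schema)
    r.put("my_number", record.my_number)
    writer.write(r, encoder)
    encoder.flush()
    out.close()
    out.toByteArray
  }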

Avro schema:

{"namespace": "",
  "type": "record",
  "name": "MyRecord",
  "fields": [
    {"name": "my_number", "type": "int"}
  ]
}

But I would like to have something like this:

case class MyRecord(my_number: Int)
val record = new MyRecord(1)

def serializeSubmapRecord(record: MyRecord): Array[Byte] = {
  val out = new ByteArrayOutputStream()
  val encoder = EncoderFactory.get.binaryEncoder(out, null)
  val writer = new GenericDatumWriter[MyRecord](avro_schema)
  writer.write(record, encoder)
  encoder.flush
  out.close
  out.toByteArray
}

The last snippet throws the exception described in the linked question. What am I doing wrong?

3 Answers:

Answer 0 (score: 5)

Another option is to use the Scala library avro4s. Disclaimer: this is my project.

With it you can create a schema like this:

case class MyRecord(my_number: Int)

val schema = AvroSchema[MyRecord]

val record = new MyRecord(1)

Or write the record out to a byte array, as in your question:

val baos = new ByteArrayOutputStream()
val os = AvroOutputStream.data[MyRecord](baos)
os.write(record)
os.close()
// The serialized bytes are now available from the underlying stream
val bytes = baos.toByteArray

Answer 1 (score: 1)

I think what you need is a SpecificDatumWriter rather than the generic one.

import java.io.ByteArrayOutputStream
import org.apache.avro.io.EncoderFactory
import org.apache.avro.specific.SpecificDatumWriter

case class MyRecord(my_number: Int)

val record = MyRecord(1)

def serializeSubmapRecord(record: MyRecord): Array[Byte] = {
  val out = new ByteArrayOutputStream()
  val encoder = EncoderFactory.get().directBinaryEncoder(out, null)

  // specific writer
  val writer = new SpecificDatumWriter[MyRecord](avro_schema)
  writer.write(record, encoder)
  encoder.flush()
  // toByteArray already returns a fresh copy, so no ByteBuffer round trip is needed
  out.toByteArray
}
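
If the writer still rejects the case class, a fallback that stays with the generic API is to copy the fields into a GenericRecord by hand, much like the first snippet in the question. As far as I can tell, GenericDatumWriter reads fields through Avro's IndexedRecord interface, which a plain case class does not implement, and that would explain a cast error. The following is only a minimal sketch under that assumption: the object name MyRecordAvro and the helpers toGeneric, serialize and deserialize are made up for illustration, and the schema is the one from the question parsed with Schema.Parser.

```scala
import java.io.ByteArrayOutputStream
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}

case class MyRecord(my_number: Int)

// Hypothetical helper object, not part of Avro
object MyRecordAvro {
  // Schema from the question, parsed once
  val schema: Schema = new Schema.Parser().parse(
    """{"namespace": "",
      |  "type": "record",
      |  "name": "MyRecord",
      |  "fields": [
      |    {"name": "my_number", "type": "int"}
      |  ]
      |}""".stripMargin)

  // Copy the case class fields into a generic record
  def toGeneric(r: MyRecord): GenericRecord = {
    val g = new GenericData.Record(schema)
    g.put("my_number", Int.box(r.my_number))
    g
  }

  // Encode with the generic writer, which accepts GenericRecord
  def serialize(r: MyRecord): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(out, null)
    new GenericDatumWriter[GenericRecord](schema).write(toGeneric(r), encoder)
    encoder.flush()
    out.toByteArray
  }

  // Decode back into the case class to check the round trip
  def deserialize(bytes: Array[Byte]): MyRecord = {
    val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
    val g = new GenericDatumReader[GenericRecord](schema).read(null, decoder)
    MyRecord(g.get("my_number").asInstanceOf[Int])
  }
}
```

Usage would be `MyRecordAvro.serialize(MyRecord(1))`. The cost of this approach is that the field mapping has to be kept in sync with the schema by hand, which is the boilerplate that libraries such as avro4s aim to remove.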

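Answer 2 (score: 0)

Try the NoSchema library. It has a more general design that decouples Scala type reflection (case classes) from data type conversion (nested maps, JSON, or Avro), and it handles SerDes with customizable rules. The reflection part can be done either with shapeless or with runtime reflection based on TypeTag.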

https://github.com/yongjiaw/datacrafts