Avro schema for a JSON array

Time: 2016-03-16 15:43:44

Tags: json serialization avro

Suppose I have the following JSON:

[
  {"id":1,"text":"some text","user_id":1},
  {"id":1,"text":"some text","user_id":2}
]

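What is the appropriate Avro schema for this array of objects?

2 answers:

Answer 0 (score: 1)

[Short answer]
The appropriate Avro schema for this array of objects looks like this: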

// assumes avsc is installed: npm install avsc
const avro = require('avsc');

const type = avro.Type.forSchema({
  type: 'array',
  items: { type: 'record', fields:
   [ { name: 'id', type: 'int' },
     { name: 'text', type: 'string' },
     { name: 'user_id', type: 'int' } ]
  }
});

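[Long answer]
We can use Avro itself to help build the schema above from a given data object. Let's use the npm package "avsc", which is a "pure JavaScript implementation of the Avro specification". Since Avro can infer a value's schema, we can use the following trick to get the schema for the given data. Unfortunately the nested field definitions are not shown when the array's schema is printed, because console.log abbreviates deeply nested objects as [Object], so we ask twice: once for the top-level structure (the array) and once for an array element: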

// don't forget to install avsc
// npm install avsc
//
const avro = require('avsc');

// avro can infer a value's schema
const type = avro.Type.forValue([
   {"id":1,"text":"some text","user_id":1}
]);

const type2 = avro.Type.forValue(
   {"id":1,"text":"some text","user_id":1}
);


console.log(type.getSchema());
console.log(type2.getSchema());

Output:

{ type: 'array',
  items: { type: 'record', fields: [ [Object], [Object], [Object] ] } }
{ type: 'record',
  fields:
   [ { name: 'id', type: 'int' },
     { name: 'text', type: 'string' },
     { name: 'user_id', type: 'int' } ] }
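
The `[Object]` placeholders in the first output come from how Node prints nested objects, not from Avro leaving anything out. Here is a minimal sketch of two ways to print every level in one go. It assumes a plain object literal standing in for the value returned by `type.getSchema()` (copied from the output above), so it runs without avsc:

```javascript
const util = require('util');

// Stand-in for the schema object shown in the output above (an assumption
// made so that this sketch has no dependency on avsc).
const schema = {
  type: 'array',
  items: {
    type: 'record',
    fields: [
      { name: 'id', type: 'int' },
      { name: 'text', type: 'string' },
      { name: 'user_id', type: 'int' }
    ]
  }
};

// depth: null tells util.inspect to recurse without a depth limit.
const full = util.inspect(schema, { depth: null });
console.log(full);

// JSON.stringify also serializes every level and yields valid JSON text.
const asJson = JSON.stringify(schema, null, 2);
console.log(asJson);
```

With either approach a single call on the array type should be enough, so the second `forValue` call becomes optional.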

Now let's write the proper schema and try to use it to serialize the objects and then deserialize them!

const avro = require('avsc');
const type = avro.Type.forSchema({
  type: 'array',
  items: { type: 'record', fields:
   [ { name: 'id', type: 'int' },
     { name: 'text', type: 'string' },
     { name: 'user_id', type: 'int' } ]
  }
});
const buf = type.toBuffer([
   {"id":1,"text":"some text","user_id":1},
   {"id":1,"text":"some text","user_id":2}]); // Encoded buffer.

const val = type.fromBuffer(buf);
console.log("deserialized object: ", JSON.stringify(val, null, 4));  // pretty print deserialized result

var fs = require('fs');
var full_filename = "/tmp/avro_buf.dat";
fs.writeFile(full_filename, buf, function(err) {
    if(err) {
        return console.log(err);
    }

    console.log("The file was saved to '" + full_filename + "'");
});

Output:

deserialized object:  [
    {
        "id": 1,
        "text": "some text",
        "user_id": 1
    },
    {
        "id": 1,
        "text": "some text",
        "user_id": 2
    }
]
The file was saved to '/tmp/avro_buf.dat'

We can even enjoy the compact binary representation produced by the exercise above:

hexdump -C /tmp/avro_buf.dat
00000000  04 02 12 73 6f 6d 65 20  74 65 78 74 02 02 12 73  |...some text...s|
00000010  6f 6d 65 20 74 65 78 74  04 00                    |ome text..|
0000001a

Nice, isn't it? :-)
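
To see why 26 bytes are enough, the dump can be decoded by hand. The following is a rough sketch that follows the Avro binary encoding rules (zig-zag variable-length integers, length-prefixed strings, arrays written as counted blocks ending with a zero count). The helper names `readInt`, `readString` and `readArray` are made up for this illustration, and the record layout is hard-coded to the three fields of the schema above:

```javascript
// Bytes copied from the hexdump above.
const hex =
  '04 02 12 73 6f 6d 65 20 74 65 78 74 02 02 12 73 ' +
  '6f 6d 65 20 74 65 78 74 04 00';
const buf = Buffer.from(hex.replace(/ /g, ''), 'hex');

let pos = 0; // current read offset into buf

// Avro ints are variable-length, zig-zag encoded integers.
// This sketch only handles values that fit in 32 bits.
function readInt() {
  let shift = 0;
  let n = 0;
  let b;
  do {
    b = buf[pos++];
    n |= (b & 0x7f) << shift;
    shift += 7;
  } while (b & 0x80);
  return (n >>> 1) ^ -(n & 1); // undo the zig-zag mapping
}

// Strings are a length followed by that many UTF-8 bytes.
function readString() {
  const len = readInt();
  const s = buf.toString('utf8', pos, pos + len);
  pos += len;
  return s;
}

// Arrays are a series of blocks: an item count, then the items,
// terminated by a block with count 0. Negative counts (blocks that
// also carry a byte size) are not handled in this sketch.
function readArray() {
  const items = [];
  let count;
  while ((count = readInt()) !== 0) {
    for (let i = 0; i < count; i++) {
      // Fields are read in schema order: id, text, user_id.
      items.push({ id: readInt(), text: readString(), user_id: readInt() });
    }
  }
  return items;
}

const decoded = readArray();
console.log(decoded);
```

Reading the dump this way: `04` is the block count 2, `02` is the int 1, `12` is the string length 9 followed by the bytes of "some text", and the trailing `00` closes the array. No field names are stored, which is why the result is so compact.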

Answer 1 (score: 0)

For your question, the correct schema is:

{
  "name": "Name",
  "type": "array",
  "namespace": "com.hi.avro.model",
  "items": {
    "name": "NameDetails",
    "type": "record",
    "fields": [
      {
        "name": "id",
        "type": "int"
      },
      {
        "name": "text",
        "type": "string"
      },
      {
        "name": "user_id",
        "type": "int"
      }
    ]
  }
}
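
As a quick sanity check, the sample data can be compared against this schema by hand. Below is a minimal sketch, assuming a hypothetical helper `matches` that only understands the types used here (`int`, `string`, `array`, `record`); it is not part of any Avro library. In the Avro specification `name` and `namespace` are attributes of named types (records, enums, fixed), so the sketch simply ignores them on the array:

```javascript
// The schema from the answer above, written as an object literal.
const schema = {
  name: 'Name',
  type: 'array',
  namespace: 'com.hi.avro.model',
  items: {
    name: 'NameDetails',
    type: 'record',
    fields: [
      { name: 'id', type: 'int' },
      { name: 'text', type: 'string' },
      { name: 'user_id', type: 'int' }
    ]
  }
};

// Hypothetical helper: true if value fits the schema, for the small
// subset of Avro types that appears in this thread.
function matches(schema, value) {
  const type = typeof schema === 'string' ? schema : schema.type;
  switch (type) {
    case 'int': // Avro int is a 32-bit signed integer
      return Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31;
    case 'string':
      return typeof value === 'string';
    case 'array':
      return Array.isArray(value) && value.every((v) => matches(schema.items, v));
    case 'record':
      return (
        typeof value === 'object' && value !== null && !Array.isArray(value) &&
        schema.fields.every((f) => matches(f.type, value[f.name]))
      );
    default:
      throw new Error('unsupported type in this sketch: ' + type);
  }
}

const sample = [
  { id: 1, text: 'some text', user_id: 1 },
  { id: 1, text: 'some text', user_id: 2 }
];
console.log(matches(schema, sample));
```

For real use, a library such as avsc would do this validation (and the encoding) for you; the sketch is only meant to show how the schema lines up with the data.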