Avro schema

{
  "namespace": "com",
  "type": "record",
  "name": "customers",
  "fields": [
    {
      "name": "customer",
      "type": {
        "type": "array",
        "items": {
          "name": "cust",
          "type": "record",
          "fields": [
            {
              "name": "age",
              "type": ["long","null"]
            },
            {
              "name": "amount",
              "type": [ "long","null"]
            }
          ]
        }
      }
    }
  ]
}

Python code

list = [[34,2000],[53,8000]]

for l in list:

    writer.append({"customer":{ "age": long(l[0]), "amount": long(l[1])}})

Is my parsing wrong? Do I need to add some kind of datum object to the array?

Answer 1

Your schema defines the customers record as containing an array of cust records. Your data should therefore have the following structure:

{"customer": [cust1, cust2, ...]}

Expanded further:

{"customer": [{"age": X1, "amount": Y1}, {"age": X2, "amount": Y2}, ...]}
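As a minimal sketch of that shape in Python, the rows from the question can be collected into a single datum whose "customer" field is a list of dicts. The helper name rows_to_datum is hypothetical and is not part of the avro library.

```python
def rows_to_datum(rows):
    """Turn [[age, amount], ...] into one datum for the array-based schema."""
    # One dict per row; together they form the "customer" array.
    return {"customer": [{"age": age, "amount": amount} for age, amount in rows]}


datum = rows_to_datum([[34, 2000], [53, 8000]])
print(datum)
```

The resulting datum would then be passed to writer.append(datum) once, rather than once per row.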

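So you can keep the schema as it is, but you will need to change the data you insert to match the format above. Alternatively, you can keep the data as it is and change the schema to the following: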

{
  "namespace": "com",
  "type": "record",
  "name": "customers",
  "fields": [
    {
      "name": "customer",
      "type": {
        "name": "cust",
        "type": "record",
        "fields": [
          {
            "name": "age",
            "type": ["long","null"]
          },
          {
            "name": "amount",
            "type": [ "long","null"]
          }
        ]
      }
    }
  ]
}
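With this flattened schema, each row becomes its own record, so the writer would be called once per row. The sketch below only builds the datums; rows_to_records is a hypothetical helper, and writer refers to the writer object from the question.

```python
def rows_to_records(rows):
    """Yield one datum per [age, amount] row for the non-array schema."""
    for age, amount in rows:
        # "customer" holds a single cust record here, not a list.
        yield {"customer": {"age": age, "amount": amount}}


records = list(rows_to_records([[34, 2000], [53, 8000]]))
for record in records:
    print(record)  # in the question's setting: writer.append(record)
```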

Answer 2

Figured it out. My Avro schema was fine; the only thing I had to change was how I add objects to the writer.

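rows = [[34, 2000], [53, 8000]]
customer = []  # the "customer" field is an array, so collect the records in a list
for l in rows:
    # build a fresh dict for every row so the entries do not share state
    cust = {'age': l[0], 'amount': l[1]}
    customer.append(cust)
# write a single datum that holds the whole array
writer.append({"customer": customer})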
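To see why the datum from the question is rejected while the list-based one is accepted, a rough structural check can help. The sketch below is a simplified stand-in written for this thread, not the avro library's own validation: the matches function is hypothetical and handles only the record, array, union, long, and null types that appear in the question's schema.

```python
import json

SCHEMA = json.loads("""
{
  "namespace": "com", "type": "record", "name": "customers",
  "fields": [{
    "name": "customer",
    "type": {"type": "array", "items": {
      "name": "cust", "type": "record",
      "fields": [
        {"name": "age", "type": ["long", "null"]},
        {"name": "amount", "type": ["long", "null"]}
      ]}}
  }]
}
""")


def matches(schema, datum):
    """Simplified structural check covering only the types used here."""
    if isinstance(schema, list):  # union: any branch may match
        return any(matches(branch, datum) for branch in schema)
    if schema == "null":
        return datum is None
    if schema == "long":
        return isinstance(datum, int) and not isinstance(datum, bool)
    if isinstance(schema, dict) and schema["type"] == "array":
        return isinstance(datum, list) and all(
            matches(schema["items"], item) for item in datum)
    if isinstance(schema, dict) and schema["type"] == "record":
        return isinstance(datum, dict) and all(
            matches(f["type"], datum.get(f["name"])) for f in schema["fields"])
    return False


bad = {"customer": {"age": 34, "amount": 2000}}  # dict where an array is expected
good = {"customer": [{"age": 34, "amount": 2000}, {"age": 53, "amount": 8000}]}
print(matches(SCHEMA, bad), matches(SCHEMA, good))  # False True
```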

Parsing a nested Avro structure in Python fails
