NiFi MergeRecord and MergeContent cannot merge Avro flow files with different schemas

Date: 2018-05-21 12:17:13

Tags: json avro apache-nifi

I am using a NiFi flow of ListFile >> FetchFile >> SplitJson >> UpdateAttribute >> FlattenJson >> InferAvroSchema >> ConvertRecord >> MergeRecord >> PutParquet.

JSON input:

[
  {
    "Id": 1235,
    "Username": "fred1235",
    "Name": "Fred",
    "ShippingAddress": {
      "Address1": "456 Main St.",
      "Address2": "",
      "City": "Durham",
      "State": "NC"
    }
  },
  {
    "Id": 1236,
    "Username": "larry1234",
    "Name": "Larry",
    "ShippingAddress": {
      "Address1": "789 Main St.",
      "Address2": "",
      "City": "Durham",
      "State": "NC",
      "PostalCode": 277453
    },
    "Orders": [
      {
        "ItemId": 1111,
        "OrderDate": "11/11/2012"
      },
      {
        "ItemId": 2222,
        "OrderDate": "12/12/2012"
      }
    ]
  }
]

The MergeRecord processor does not include the "Orders" array in the schema of the merged file. The MergeContent processor has the same problem.
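To make the mismatch concrete, here is a small Python sketch (not NiFi code) that lists the fields present in the second record but not in the first. The helper `field_paths` is a hypothetical name introduced for illustration only. It assumes the schema is inferred separately for each split record, in which case the two records would end up with different schemas, which is a plausible reason they are not merged into one file with a common schema.

```python
record_1 = {
    "Id": 1235, "Username": "fred1235", "Name": "Fred",
    "ShippingAddress": {"Address1": "456 Main St.", "Address2": "",
                        "City": "Durham", "State": "NC"},
}
record_2 = {
    "Id": 1236, "Username": "larry1234", "Name": "Larry",
    "ShippingAddress": {"Address1": "789 Main St.", "Address2": "",
                        "City": "Durham", "State": "NC", "PostalCode": 277453},
    "Orders": [{"ItemId": 1111, "OrderDate": "11/11/2012"},
               {"ItemId": 2222, "OrderDate": "12/12/2012"}],
}


def field_paths(value, prefix=""):
    """Collect dotted paths of all leaf fields; array elements are marked with []."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            paths |= field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for item in value:
            paths |= field_paths(item, prefix + "[]")
    else:
        paths.add(prefix)
    return paths


# Fields that exist only in the second record.
only_in_second = sorted(field_paths(record_2) - field_paths(record_1))
print(only_in_second)
```

The second record carries both the "Orders" array and "PostalCode", neither of which appears in the first record.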

1 answer:

Answer 0 (score: 1)

Instead of SplitJson and FlattenJson, you could use JoltTransformJSON with the following Chain spec to flatten the whole thing without splitting:

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "ShippingAddress": {
          "Address1": "[&2].ShippingAddress_Address1",
          "Address2": "[&2].ShippingAddress_Address2",
          "City": "[&2].ShippingAddress_City",
          "State": "[&2].ShippingAddress_State"
        },
        "Orders": {
          "*": {
            "ItemId": "[&3].Orders_&1_ItemId",
            "OrderDate": "[&3].Orders_&1_OrderDate"
          }
        },
        "*": "[&1].&"
      }
    }
  }
]
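To see roughly what this spec should produce for the sample input, here is a Python sketch that imitates the shift by hand. It is not Jolt and not NiFi code; `flatten_like_spec` is a hypothetical helper written only for this illustration, and the key order of the real output may differ. Note that a shift only copies what the spec names, so "PostalCode" would not be carried over unless a matching entry is added under "ShippingAddress".

```python
import json

records = [
    {
        "Id": 1235, "Username": "fred1235", "Name": "Fred",
        "ShippingAddress": {"Address1": "456 Main St.", "Address2": "",
                            "City": "Durham", "State": "NC"},
    },
    {
        "Id": 1236, "Username": "larry1234", "Name": "Larry",
        "ShippingAddress": {"Address1": "789 Main St.", "Address2": "",
                            "City": "Durham", "State": "NC", "PostalCode": 277453},
        "Orders": [{"ItemId": 1111, "OrderDate": "11/11/2012"},
                   {"ItemId": 2222, "OrderDate": "12/12/2012"}],
    },
]


def flatten_like_spec(items):
    """Imitate the shift spec: prefix address fields, index order fields, copy the rest."""
    address_keys = ("Address1", "Address2", "City", "State")  # keys named in the spec
    order_keys = ("ItemId", "OrderDate")                      # keys named in the spec
    result = []
    for rec in items:
        flat = {}
        for key, value in rec.items():
            if key == "ShippingAddress":
                for k in address_keys:
                    if k in value:
                        flat[f"ShippingAddress_{k}"] = value[k]
            elif key == "Orders":
                for i, order in enumerate(value):
                    for k in order_keys:
                        if k in order:
                            flat[f"Orders_{i}_{k}"] = order[k]
            else:
                # Corresponds to the catch-all "*": "[&1].&" rule.
                flat[key] = value
        result.append(flat)
    return result


flat = flatten_like_spec(records)
print(json.dumps(flat, indent=2))
```

The flattened records still have different field sets: only the second one has the Orders_0_* and Orders_1_* fields, so any schema used downstream would need to treat those fields as optional.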

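I'm not sure what ConvertRecord is for, but you shouldn't need MergeRecord anymore. If this is not the output you are looking for, please let me know what you expect (for both records, the one with and the one without the Orders field) and I'll be happy to help.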