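Converting a JSON file from a flat array to a nested array in Azure Data Factory

Time: 2019-05-17 09:53:12

Tags: arrays json azure azure-data-factory

I am trying to copy data from an Oracle database into a search index, and I am using Azure Data Factory to copy the data from Oracle to Azure Blob Storage. How can I use it to write the data out as a nested JSON file? Right now, after querying Oracle, I get data like this: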

[{"BOOKING_ID":1.0,"REFERENCES":"ABC00001","ROUTES":{"ROUTE":1.0,"DESTINATION":"Atlanta, USA","ORIGIN":"New York, USA"}}
,{"BOOKING_ID":2.0,"REFERENCES":"ABC00322","ROUTES":{"ROUTE":2.0,"DESTINATION":"Las Vegas, USA","ORIGIN":"Los Angeles, USA"}}
,{"BOOKING_ID":3.0,"REFERENCES":"ABC32322","ROUTES":{"ROUTE":3.0,"DESTINATION":"Berlin, GER","ORIGIN":"Moscow, RUS"}}
,{"BOOKING_ID":4.0,"REFERENCES":"ABC543345","ROUTES":{"ROUTE":4.0,"DESTINATION":"Rome, ITA","ORIGIN":"Bejin, CHN"}}
,{"BOOKING_ID":5.0,"REFERENCES":"ABC51145","ROUTES":{"ROUTE":5.0,"DESTINATION":"Warsaw, POL","ORIGIN":"Copenhagen, DEN"}}
,{"BOOKING_ID":5.0,"REFERENCES":"ABC51145","ROUTES":{"ROUTE":6.0,"DESTINATION":"Copenhaged, DEN","ORIGIN":"Paris, FRA"}}
,{"BOOKING_ID":5.0,"REFERENCES":"ABC51145","ROUTES":{"ROUTE":7.0,"DESTINATION":"Paris, FRA","ORIGIN":"Madrid, ESP"}}
]

But I need the data to look like this:

[
  {
    "BOOKING_ID": 1.0,
    "REFERENCES": "ABC00001",
    "ROUTES": [
      {
        "ROUTE": 1.0,
        "DESTINATION": "Atlanta, USA",
        "ORIGIN": "New York, USA"
      }
    ]
  },
  {
    "BOOKING_ID": 2.0,
    "REFERENCES": "ABC00322",
    "ROUTES": [
      {
        "ROUTE": 2.0,
        "DESTINATION": "Las Vegas, USA",
        "ORIGIN": "Los Angeles, USA"
      }
    ]
  },
  {
    "BOOKING_ID": 3.0,
    "REFERENCES": "ABC32322",
    "ROUTES": [
      {
        "ROUTE": 3.0,
        "DESTINATION": "Berlin, GER",
        "ORIGIN": "Moscow, RUS"
      }
    ]
  },
  {
    "BOOKING_ID": 4.0,
    "REFERENCES": "ABC543345",
    "ROUTES": [
      {
        "ROUTE": 4.0,
        "DESTINATION": "Rome, ITA",
        "ORIGIN": "Bejin, CHN"
      }
    ]
  },
  {
    "BOOKING_ID": 5.0,
    "REFERENCES": "ABC51145",
    "ROUTES": [
      {
        "ROUTE": 5.0,
        "DESTINATION": "Warsaw, POL",
        "ORIGIN": "Copenhagen, DEN"
      },
      {
        "ROUTE": 6.0,
        "DESTINATION": "Copenhaged, DEN",
        "ORIGIN": "Paris, FRA"
      },
      {
        "ROUTE": 7.0,
        "DESTINATION": "Paris, FRA",
        "ORIGIN": "Madrid, ESP"
      }
    ]
  }
]

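Update: I used an Azure Function together with lodash, but now I am trying to read the JSON from Azure Blob Storage. The problem is that when I try to read the JSON, I get a result like this: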

"type": "Buffer",
    "data": [
        239,
        187,
        191,
        91,
        123,
...

All of the data comes back as bytes.
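
A note on the dump above: a `"type": "Buffer"` object with a `data` array is how a Node.js Buffer looks when serialized to JSON, so the content is still raw bytes that have not been decoded to text. The first three values (239, 187, 191) are the UTF-8 byte order mark, and 91 and 123 are `[` and `{`. A minimal sketch of the decoding step, written in Python for illustration; the payload after the first five bytes is made up, since the original dump is truncated:

```python
import json

# First five bytes are taken from the dump in the question:
# 239, 187, 191 is the UTF-8 BOM, then 91 is "[" and 123 is "{".
# The remainder of the payload is invented for illustration only.
raw = bytes([239, 187, 191, 91, 123]) + b'"BOOKING_ID":1.0}]'

# "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
text = raw.decode("utf-8-sig")

records = json.loads(text)
print(records)  # [{'BOOKING_ID': 1.0}]
```

Decoding with plain `utf-8` would leave the BOM character at the start of the string, which `json.loads` rejects, so the BOM has to be removed one way or another before parsing.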

1 Answer:

Answer 0: (score: 0)

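Your requirement is to group the records by BOOKING_ID and merge the ROUTES objects into a single array. That cannot be done directly in the Copy activity.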
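
To make the required transformation concrete, here is a minimal sketch of the grouping step in Python. The function name `nest_routes` is made up for illustration, and it assumes REFERENCES is the same for every record that shares a BOOKING_ID, as in the sample data:

```python
def nest_routes(flat_records):
    """Group flat records by BOOKING_ID and collect each ROUTES object into a list."""
    grouped = {}  # dicts keep insertion order, so bookings stay in first-seen order
    for record in flat_records:
        key = record["BOOKING_ID"]
        if key not in grouped:
            # First time this booking is seen: start it with an empty ROUTES list.
            grouped[key] = {
                "BOOKING_ID": record["BOOKING_ID"],
                "REFERENCES": record["REFERENCES"],
                "ROUTES": [],
            }
        grouped[key]["ROUTES"].append(record["ROUTES"])
    return list(grouped.values())


# Two bookings taken from the sample data in the question.
flat = [
    {"BOOKING_ID": 4.0, "REFERENCES": "ABC543345",
     "ROUTES": {"ROUTE": 4.0, "DESTINATION": "Rome, ITA", "ORIGIN": "Bejin, CHN"}},
    {"BOOKING_ID": 5.0, "REFERENCES": "ABC51145",
     "ROUTES": {"ROUTE": 5.0, "DESTINATION": "Warsaw, POL", "ORIGIN": "Copenhagen, DEN"}},
    {"BOOKING_ID": 5.0, "REFERENCES": "ABC51145",
     "ROUTES": {"ROUTE": 6.0, "DESTINATION": "Copenhaged, DEN", "ORIGIN": "Paris, FRA"}},
]

nested = nest_routes(flat)
```

Because the records are keyed in a dictionary, this version does not need the input to be sorted by BOOKING_ID.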

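Two ideas:

1. Use Web Activity + Azure Function Activity

Wrap the query in a REST API that returns the flat JSON data, and call it from the Web Activity. Pass the output of the Web Activity into the Azure Function Activity. In the Azure Function, loop over the flat JSON array and build the nested array you need, then configure the output of the Azure Function as Azure Blob Storage. (See this link.)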

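2. Use Custom Activity

You can run a script in an Azure Batch job, which runs on VMs. For example, you could use the cx_Oracle package to query the data ordered by BOOKING_ID, then loop over the result in Python code and convert it as needed.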
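
A rough sketch of the loop for the second idea, assuming the query returns rows already sorted by BOOKING_ID. The column order (BOOKING_ID, REFERENCES, ROUTE, DESTINATION, ORIGIN) and the function name `nest_sorted_rows` are assumptions made for illustration, and a hard-coded list stands in for the rows a database cursor would return:

```python
from itertools import groupby
from operator import itemgetter


def nest_sorted_rows(rows):
    """Nest rows that are already sorted by BOOKING_ID (e.g. via ORDER BY)."""
    result = []
    # groupby only merges adjacent rows, which is why the ORDER BY matters.
    for (booking_id, reference), group in groupby(rows, key=itemgetter(0, 1)):
        routes = [
            {"ROUTE": route, "DESTINATION": destination, "ORIGIN": origin}
            for _, _, route, destination, origin in group
        ]
        result.append(
            {"BOOKING_ID": booking_id, "REFERENCES": reference, "ROUTES": routes}
        )
    return result


# Stand-in for rows fetched from a cursor; the column order is hypothetical.
rows = [
    (4.0, "ABC543345", 4.0, "Rome, ITA", "Bejin, CHN"),
    (5.0, "ABC51145", 5.0, "Warsaw, POL", "Copenhagen, DEN"),
    (5.0, "ABC51145", 6.0, "Copenhaged, DEN", "Paris, FRA"),
    (5.0, "ABC51145", 7.0, "Paris, FRA", "Madrid, ESP"),
]

nested_rows = nest_sorted_rows(rows)
```

Since `groupby` handles one run of adjacent rows at a time, each booking can be built without holding a lookup table of all bookings, but unsorted input would produce duplicate bookings.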