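I have been looking at this for a few weeks (in the background) and am struggling to work out how to use the NiFi JoltTransformJson processor to convert CSV-like JSON data into labeled sets. By that I mean using the data in the first row of the input array as the JSON object names in the output.
As an example, I have this input data: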
[
  [
    "Company",
    "Retail Cost",
    "Percentage"
  ],
  [
    "ABC",
    "5,368.11",
    "17.09%"
  ],
  [
    "DEF",
    "101.47",
    "0.32%"
  ],
  [
    "GHI",
    "83.79",
    "0.27%"
  ]
]
And the output I want is:
[
  {
    "Company": "ABC",
    "Retail Cost": "5,368.11",
    "Percentage": "17.09%"
  },
  {
    "Company": "DEF",
    "Retail Cost": "101.47",
    "Percentage": "0.32%"
  },
  {
    "Company": "GHI",
    "Retail Cost": "83.79",
    "Percentage": "0.27%"
  }
]
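I see this as mainly two problems: accessing the contents of the first array, and then making sure the output data does not include that first array.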
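To make the target concrete, here is a minimal Python sketch of the reshaping I am after, written outside of Jolt. The name `rows_to_records` is purely illustrative, and the sketch assumes every data row has the same number of columns as the header row.

```python
import json


def rows_to_records(table):
    """Turn [[header...], [row...], ...] into a list of dicts keyed by the header."""
    if not table:
        return []
    # Problem 1: read the first inner array as the header.
    # Problem 2: keep only the remaining rows in the output.
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


table = [
    ["Company", "Retail Cost", "Percentage"],
    ["ABC", "5,368.11", "17.09%"],
    ["DEF", "101.47", "0.32%"],
    ["GHI", "83.79", "0.27%"],
]

records = rows_to_records(table)
print(json.dumps(records, indent=2))
```

This is only a statement of the desired behavior; the question is how to express the same thing as a Jolt spec.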
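I had hoped to post a Jolt spec showing that I got somewhat close on my own, but my closest attempt gives me the right output shape without the right content. It looks like this: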
[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": "[&1].&0"
      }
    }
  }
]
But it produces output like this:
[ {
"0" : "Company",
"1" : "Retail Cost",
"2" : "Percentage"
}, {
"0" : "ABC",
"1" : "5,368.11",
"2" : "17.09%"
}, {
"0" : "DEF",
"1" : "101.47",
"2" : "0.32%"
}, {
"0" : "GHI",
"1" : "83.79",
"2" : "0.27%"
} ]
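Clearly this has the wrong object names, and the output has too many elements because the header row is included.
Answer 0 (score: 2)
It can be done, but wow, it is hard to read; it looks like a scary regex.
Spec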
[
  {
    // This does most of the work, but produces an output
    // array with a null in the zeroth slot.
    "operation": "shift",
    "spec": {
      // Match the first item in the outer array and do
      // nothing with it, because it is just "header" data,
      // e.g. "Company", "Retail Cost", "Percentage".
      // We need to reference it, but not pass it through.
      "0": null,
      //
      // Loop over all the rest of the items in the outer array.
      "*": {
        // This is rather confusing.
        // "*" means match the array indices of the inner array,
        // and we will write the value at that index ("ABC" etc.)
        // to "[&1].@(2,[0].[&])".
        // "[&1]" means make the output an array, and write at index
        // &1, which is the index of the outer array we are
        // currently in.
        // Then "look up the key" (Company, Retail Cost) using
        // @(2,[0].[&]),
        // which means: go back up the tree to the root, then
        // come back down into the first item of the outer array
        // and index it by the array index of the current
        // inner array element that we are at.
        "*": "[&1].@(2,[0].[&])"
      }
    }
  },
  {
    // We know the first item in the array will be null / junk,
    // because the first item in the input array was "header" info.
    // So we match the first item, and then accumulate everything
    // else into a new array.
    "operation": "shift",
    "spec": {
      "0": null,
      "*": "[]"
    }
  }
]
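To sanity-check the expected result outside NiFi, the two passes can be mimicked in plain Python. This is only a sketch of what the spec comments describe, not how Jolt is implemented; `first_shift` and `second_shift` are hypothetical helper names, and the sketch assumes each row is no longer than the header row.

```python
def first_shift(table):
    """Mimic pass 1: write each cell to out[row_index][header[col_index]]."""
    if not table:
        return []
    header = table[0]
    # Slot 0 stays None, like the null the first shift leaves behind.
    out = [None] * len(table)
    for row_index, row in enumerate(table):
        if row_index == 0:
            continue  # "0": null -> header row is looked up but not passed through
        # Equivalent of "[&1].@(2,[0].[&])": key comes from header[col_index].
        out[row_index] = {
            header[col_index]: value for col_index, value in enumerate(row)
        }
    return out


def second_shift(items):
    """Mimic pass 2: skip index 0 and accumulate the rest into a new list."""
    return [item for index, item in enumerate(items) if index != 0]


table = [
    ["Company", "Retail Cost", "Percentage"],
    ["ABC", "5,368.11", "17.09%"],
    ["DEF", "101.47", "0.32%"],
    ["GHI", "83.79", "0.27%"],
]

intermediate = first_shift(table)
result = second_shift(intermediate)
print(result)
```

Here `intermediate` holds a placeholder at index 0 followed by the three labeled objects, and `result` is the list with that placeholder removed, which is the shape asked for in the question.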