My JSON structure is as follows:
{
  "First": "xxxx",
  "Country": "XX",
  "Loop": {
    "Links": [
      {
        "Url": "xxxx",
        "Time": 123
      },
      {
        "Url": "xxxx",
        "Time": 123
      }
    ],
    "TotalTime": 123,
    "Date": "2018-04-09T10:29:39.0233082+00:00"
  }
}
I want to extract these properties:
First
Country
Url and Time for each object in the Links array
TotalTime
Date
Here is my query:
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

// Read each line of the input file as a single string column.
@extraction =
    EXTRACT jsonString string
    FROM @"/storage-api/input.json"
    USING Extractors.Tsv(quoting:false);

// Drop the lines that are not JSON documents.
@cleanUp =
    SELECT jsonString
    FROM @extraction
    WHERE (!jsonString.Contains("Part: h") AND jsonString != "465}");

// Parse each JSON string into a map of top-level properties.
@jsonify =
    SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(jsonString) AS obj
    FROM @cleanUp;

@columnized =
    SELECT obj["First"] AS first,
           obj["Country"] AS country
    FROM @jsonify;

OUTPUT @columnized
TO @"/storage-api/outputs/tpe1-output.csv"
USING Outputters.Csv();
But this query only extracts the first two properties, and I don't know how to query the nested data inside "Loop".
Answer 0 (score: 1)
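You can do this with the MultiLevelJsonExtractor (commented here) and a JSON path (e.g. Loop.Links[*]). The MultiLevelJsonExtractor has a nice feature: if a node isn't found at your base path, it will check recursively for it, although I'm not sure how performance scales to large JSON documents or large volumes of JSON documents.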
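To make the expected row shape concrete, here is a small Python sketch (not U-SQL, and not the extractor's actual implementation) of the flattening that the Loop.Links[*] path is meant to produce. It assumes that a column missing from an array element is resolved by walking up through the enclosing objects; the flatten helper and the "a"/"b" placeholder values are hypothetical and exist only for illustration.

```python
import json

def flatten(doc, row_path, columns):
    """Emit one row per element of the array at row_path. Each column is
    looked up on the element first, then on each enclosing object."""
    # Walk down to the array, remembering every object passed on the way.
    ancestors = [doc]
    node = doc
    for key in row_path:
        node = node[key]
        ancestors.append(node)
    ancestors.pop()  # the last entry is the array itself, not an object

    rows = []
    for item in node:
        # Innermost scope first: the element, then Loop, then the root.
        scopes = [item] + ancestors[::-1]
        row = {}
        for col in columns:
            row[col] = next((s[col] for s in scopes if col in s), None)
        rows.append(row)
    return rows

# Placeholder document with the same shape as the one in the question.
doc = json.loads("""
{
  "First": "xxxx",
  "Country": "XX",
  "Loop": {
    "Links": [
      {"Url": "a", "Time": 1},
      {"Url": "b", "Time": 2}
    ],
    "TotalTime": 123,
    "Date": "2018-04-09T10:29:39.0233082+00:00"
  }
}
""")

rows = flatten(doc, ["Loop", "Links"],
               ["First", "Country", "Date", "Url", "Time", "TotalTime"])
for row in rows:
    print(row)
```

Each element of Links becomes one row, and the parent values (First, Country, Date, TotalTime) are repeated on every row.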
Try this:
DECLARE @input string = "/input/input65.json";

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

@result =
    EXTRACT First string,
            Country string,
            Date DateTime,
            Url string,
            Time string,
            TotalTime int
    FROM @input
    USING new MultiLevelJsonExtractor("Loop.Links[*]",
                                      false,
                                      "First",
                                      "Country",
                                      "Date",
                                      "Url",
                                      "Time",
                                      "TotalTime"
                                     );

OUTPUT @result
TO "/output/output.csv"
USING Outputters.Csv();
My results:
HTH