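U-SQL unable to extract data from a JSON file

Time: 2016-03-10 10:23:00

Tags: azure-data-factory azure-data-lake u-sql

I am trying to extract data from a JSON file using U-SQL. The query either runs successfully without producing any output data, or it fails with a "vertex failed fast" error.

The JSON file looks like this: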

{
  "results": [
    {
      "name": "Sales/Account",
      "id": "7367e3f2-e1a5-11e5-80e8-0933ecd4cd8c",
      "deviceName": "HP",
      "deviceModel": "g6-pavilion",
      "clientip": "0.41.4.1"
    },
    {
      "name": "Sales/Account",
      "id": "c01efba0-e0d5-11e5-ae20-af6dc1f2c036",
      "deviceName": "acer",
      "deviceModel": "veriton",
      "clientip": "10.10.14.36"
    }
  ]
}

My U-SQL script is:

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

DECLARE @in string="adl://xyz.azuredatalakestore.net/todelete.json";

DECLARE @out string="adl://xyz.azuredatalakestore.net/todelete.tsv";

@trail2=EXTRACT results string FROM @in USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@jsonify=SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(results,"name","id","deviceName","deviceModel","clientip") AS rec FROM @trail2;

@logSchema=SELECT rec["name"] AS sysName,
              rec["id"] AS sysId,
              rec["deviceName"] AS domainDeviceName,
              rec["deviceModel"] AS domainDeviceModel,
              rec["clientip"] AS domainClientIp 
       FROM @jsonify;

OUTPUT @logSchema TO @out USING Outputters.Tsv();

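2 answers:

Answer 0: (score: 9)

Actually, the JsonExtractor supports a rowpath parameter, expressed in JSONPath, that lets you identify the JSON objects or JSON array items you want to map to rows. So you can extract the data from your JSON document with a single statement: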

// @input is the path to the JSON file (declared as @in in the question's script).
@logSchema =
    EXTRACT name string, id string, deviceName string, deviceModel string, clientip string
    FROM @input
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor("results[*]");
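
For reference, here is a sketch of how that statement might slot into the question's script. It assumes the same two assemblies are registered in the database and reuses the question's @in and @out paths; the intermediate rowset name @rows is made up for illustration, and the script has not been run against the asker's account.

```usql
// Sketch only: the question's declarations combined with the rowpath-based extractor.
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

DECLARE @in string = "adl://xyz.azuredatalakestore.net/todelete.json";
DECLARE @out string = "adl://xyz.azuredatalakestore.net/todelete.tsv";

// The rowpath "results[*]" asks the extractor to map each item of the
// "results" array to one row, so the columns can be read directly.
// @rows is a hypothetical name chosen for this sketch.
@rows =
    EXTRACT name string,
            id string,
            deviceName string,
            deviceModel string,
            clientip string
    FROM @in
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor("results[*]");

// Rename the columns the same way the question's script does.
@logSchema =
    SELECT name AS sysName,
           id AS sysId,
           deviceName AS domainDeviceName,
           deviceModel AS domainDeviceModel,
           clientip AS domainClientIp
    FROM @rows;

OUTPUT @logSchema TO @out USING Outputters.Tsv();
```

With this shape, neither the JsonTuple step nor an intermediate file should be needed, since the array is already split into rows at extraction time.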

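Answer 1: (score: 2)

Sara,

The problem is that your @trail2 output is a JSON array, "[{...},{...}]", which, as far as I know, JsonFunctions cannot parse. So I wrote it out to a file and read it back in with the extractor, which can parse the array.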

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

DECLARE @in string="adl://xyz.azuredatalakestore.net/todelete.json";
DECLARE @out string="adl://xyz.azuredatalakestore.net/todelete.tsv";
DECLARE @mid string="adl://xyz.azuredatalakestore.net/intermediate.txt";


@trail2=EXTRACT results string FROM @in USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

OUTPUT @trail2 TO @mid USING Outputters.Text(quoting:false);

@jsonify=EXTRACT name string,
                id string, 
                deviceName string ,
                deviceModel string,
                clientip string
FROM @mid USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@logSchema=SELECT name AS sysName,
              id AS sysId,
              deviceName AS domainDeviceName,
              deviceModel AS domainDeviceModel,
              clientip AS domainClientIp 
       FROM @jsonify;

OUTPUT @logSchema TO @out USING Outputters.Tsv();