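How to collect sensor events into hourly documents with an array containing a subset of the original message fields:
Incoming events have the following format: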
{"plantId": "Plant A", "machineId" : "M001", "sensorId": "S001", "unit": "kg", "time": "2017-09-05T22:00:14.9410000Z", "value": 1234.56}
{"plantId": "Plant A", "machineId" : "M001", "sensorId": "S001", "unit": "kg", "time": "2017-09-05T22:00:19.5410000Z", "value": 1334.76}
...
I want to get the following output for each sensor every hour:
{"plantId": "Plant A", "machineId" : "M001", "sensorId": "S001", "unit": "kg",
"from" : "2017-09-05T22:00:14.9410000Z", "to" : "2017-09-05T22:59:55.5410000Z",
"data": [
{"time": "2017-09-05T22:01:14.9410000Z", "value": 1234.56},
{"time": "2017-09-05T22:01:19.5410000Z", "value": 1334.76},
....
]
}
I created the following query:
SELECT PlantId, MachineId, SensorId, Unit,
MIN(Time) AS From, MAX(Time) AS To,
Collect() AS Data
INTO CosmosDBOutput
FROM SensorsInput TIMESTAMP BY CAST(time as datetime)
GROUP BY PlantId, MachineId, SensorId, Unit, TumblingWindow(hour,1)
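The problem is that Collect() returns an array of the complete original events, but I only want the time and value fields in it.
How can I reduce the Collect() result to just these fields?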
Answer 0: (score: 2)
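Based on your description, I suggest considering JavaScript user-defined functions.
You can define a custom function that removes the fields you do not need.
For details, follow these steps:
1. Create the UDF:
2. Add the following code to the function: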
// UDF that strips the identifying fields from each collected event,
// leaving only time and value.
function main(InputJSON) {
for (var i = 0; i < InputJSON.length; i++) {
delete InputJSON[i].plantId;
delete InputJSON[i].machineId;
delete InputJSON[i].sensorId;
delete InputJSON[i].unit;
}
return InputJSON;
}
3. Change the query:
Note: replace UDF.remove with the name of your own UDF (UDF.yourUDFname).
SELECT
PlantId, MachineId, SensorId, Unit,UDF.remove(Collect()) AS Data,min(time) as fromdate,max(time) as todate
INTO
[YourOutputAlias]
FROM
[YourInputAlias] TIMESTAMP BY time
GROUP BY PlantId, MachineId, SensorId, Unit, TumblingWindow(hour,1)
Result:
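As a rough illustration of what the reduced array looks like, the UDF logic can be tried locally in Node.js before deploying it. This is only a sketch under assumptions: it assumes Collect() passes the UDF an array of records whose property names match the casing of the incoming events, the sample array is made up from the two events in the question, and the helper name keepTimeAndValue is hypothetical. It also uses a whitelist (copying time and value into new objects) instead of deleting fields, which is a variation on the UDF above, not the answer's own code.

```javascript
// Hypothetical whitelist variant of the UDF body: keep only time and value.
// Inside the Stream Analytics job this logic would live in the UDF function
// (named main in the answer); here it is a plain function for local testing.
function keepTimeAndValue(events) {
  // Build new objects instead of mutating the records that were passed in.
  return events.map(function (e) {
    return { time: e.time, value: e.value };
  });
}

// Sample records shaped like the incoming events in the question
// (assumption: this is what one window of Collect() output contains).
var sample = [
  { plantId: "Plant A", machineId: "M001", sensorId: "S001", unit: "kg",
    time: "2017-09-05T22:00:14.9410000Z", value: 1234.56 },
  { plantId: "Plant A", machineId: "M001", sensorId: "S001", unit: "kg",
    time: "2017-09-05T22:00:19.5410000Z", value: 1334.76 }
];

var reduced = keepTimeAndValue(sample);

// Print the reduced array; each element should carry only time and value.
console.log(JSON.stringify(reduced, null, 2));
```

One possible advantage of the whitelist form is that any extra fields added to the events later would be dropped automatically, whereas the delete-based version has to name every field to remove.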