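I'm trying to extract some logged events from Application Insights into our SQL database. I have no control over the format of the input, which is JSON files made up of multiple JSON arrays. In each record, 5 pieces of information sit in a JSON array at [context].[custom].[dimensions], and I use OUTER APPLY to flatten these values. The problem is that instead of one row per record, it returns 5 rows, as if one row had been joined to 5 (which is really what has happened): each of the 5 values is NULL in 4 of the rows and holds the actual value in the other. I only need 2 of the 5 values, PageType and UserId, and with those in my GROUP BY it returns 3 rows per record: one for each of the two values, and one where both are NULL.
In normal SQL you would just use a MAX expression to get the actual value for each column, but in Stream Analytics you can't use MAX on strings. You also can't use COALESCE, nor some other methods I've tried to get around this. Any ideas on how to change the result from: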
EventDateTime  Event     PageType  UserId  AppVersion  CountA
2017-05-24     Nav Show  NULL      NULL    2.0.1293    1
2017-05-24     Nav Show  NULL      SIRTSW  2.0.1293    1
2017-05-24     Nav Show  Trade     NULL    2.0.1293    1
to this?
2017-05-24     Nav Show  Trade     SIRTSW  2.0.1293    1
The code that returns three rows per record is below (note that e.event is an array of one item, so it doesn't cause the same problem):
SELECT
    flatEvent.ArrayValue.name as Event,
    e.context.data.eventTime as EventDateTime,
    e.context.application.version as AppVersion,
    flatCustom.ArrayValue.UserId as UserId,
    flatCustom.ArrayValue.PageType as PageType,
    SUM(flatEvent.ArrayValue.count) as CountA
INTO
    [insights]
FROM [ios] e
CROSS APPLY GetArrayElements(e.[event]) as flatEvent
OUTER APPLY GetArrayElements(e.[context].[custom].[dimensions]) as flatCustom
GROUP BY
    SlidingWindow(minute, 1),
    flatEvent.ArrayValue.name,
    e.context.data.eventTime,
    e.context.application.version,
    flatCustom.ArrayValue.UserId,
    flatCustom.ArrayValue.PageType
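Thanks in advance, Rob
Answer 0 (score: 1)
Based on your scenario, I assume you could use JavaScript user-defined functions for Azure Stream Analytics to merge the multiple dimensions into a single record. Here is my test for this issue, which you can use as a reference.
JSON file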
{
    "context": {
        "data": {"eventTime": "2017-05-24"},
        "application": {"version": "2.0.1293"},
        "custom": {
            "dimensions": [
                {"PageType": null, "UserId": "SIRTSW"},
                {"PageType": "Trade", "UserId": null},
                {"PageType": null, "UserId": null}
            ]
        }
    },
    "event": [
        {"name": "Nav Show", "count": 1}
    ]
}
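For reference, the three-row result from the question can be reproduced from this sample in plain JavaScript. This is only a rough imitation of what OUTER APPLY GetArrayElements does to a record, not Stream Analytics code, and `flattenRecord` is a hypothetical helper name used for illustration.

```javascript
// Plain-JavaScript imitation of the row fan-out caused by
// OUTER APPLY GetArrayElements. Not Stream Analytics code.
const record = {
  context: {
    data: { eventTime: "2017-05-24" },
    application: { version: "2.0.1293" },
    custom: {
      dimensions: [
        { PageType: null, UserId: "SIRTSW" },
        { PageType: "Trade", UserId: null },
        { PageType: null, UserId: null }
      ]
    }
  },
  event: [{ name: "Nav Show", count: 1 }]
};

// flattenRecord (hypothetical): one output row per (event, dimension) pair.
function flattenRecord(rec) {
  const rows = [];
  const dimensions = rec.context.custom.dimensions;
  // OUTER APPLY keeps the left-hand row even when the array is empty,
  // which is mimicked here with a single empty element.
  const dims = dimensions.length > 0 ? dimensions : [{}];
  for (const ev of rec.event) {
    for (const dim of dims) {
      rows.push({
        EventDateTime: rec.context.data.eventTime,
        Event: ev.name,
        PageType: dim.PageType ?? null,
        UserId: dim.UserId ?? null,
        AppVersion: rec.context.application.version,
        CountA: ev.count
      });
    }
  }
  return rows;
}

const rows = flattenRecord(record);
console.log(rows.length); // 3
console.log(rows.map(r => [r.PageType, r.UserId]));
```

Each of the three rows carries a different (PageType, UserId) pair, so grouping by both columns cannot collapse them into one row.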
JavaScript UDF, UDF.coalesce
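function main(items) {
    // Merge all elements of the dimensions array into a single record.
    var result = [];
    var UserIdStr = "", PageTypeStr = "";
    for (var i = 0; i < items.length; i++) {
        // "!= null" also excludes undefined, so one check per field is enough.
        if (items[i].UserId != null)
            UserIdStr += items[i].UserId;
        if (items[i].PageType != null)
            PageTypeStr += items[i].PageType;
    }
    // Return a one-element array so the query can flatten it with GetArrayElements.
    result.push({UserId: UserIdStr, PageType: PageTypeStr});
    return result;
}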
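The UDF can be sanity-checked outside Stream Analytics before deploying it. The sketch below assumes Node.js, copies the function body unchanged, and feeds it the dimensions array from the sample JSON file.

```javascript
// Copy of the UDF from this answer, so it can run locally in Node.js.
function main(items) {
  var result = [];
  var UserIdStr = "", PageTypeStr = "";
  for (var i = 0; i < items.length; i++) {
    if (items[i].UserId != null && items[i].UserId != undefined)
      UserIdStr += items[i].UserId;
    if (items[i].PageType != null && items[i].PageType != undefined)
      PageTypeStr += items[i].PageType;
  }
  result.push({ UserId: UserIdStr, PageType: PageTypeStr });
  return result;
}

// The dimensions array from the sample JSON file.
var dimensions = [
  { PageType: null, UserId: "SIRTSW" },
  { PageType: "Trade", UserId: null },
  { PageType: null, UserId: null }
];

var merged = main(dimensions);
console.log(JSON.stringify(merged));
// [{"UserId":"SIRTSW","PageType":"Trade"}]
```

The three array elements collapse into one record, which is what lets the query return a single row per event.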
Query
--first query
WITH f AS (
    SELECT
        e.context.data.eventTime as EventDateTime,
        e.context.application.version as AppVersion,
        e.event as flatEvent,
        UDF.coalesce(e.[context].[custom].[dimensions]) as flatDimensions
    FROM [ios] e
)
--second query
SELECT
    flatEvent.ArrayValue.name as Event,
    f.EventDateTime,
    f.AppVersion,
    flatDimension.ArrayValue.UserId,
    flatDimension.ArrayValue.PageType,
    SUM(flatEvent.ArrayValue.count) as CountA
FROM f
CROSS APPLY GetArrayElements(f.[flatEvent]) as flatEvent
OUTER APPLY GetArrayElements(f.[flatDimensions]) as flatDimension
GROUP BY
    SlidingWindow(minute, 1),
    flatEvent.ArrayValue.name,
    f.EventDateTime,
    f.AppVersion,
    flatDimension.ArrayValue.UserId,
    flatDimension.ArrayValue.PageType
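One caveat with the UDF: it concatenates every non-null value it finds, so if two array elements both carried a UserId, the result would be the two strings joined together, and a field with no value comes back as an empty string rather than NULL. If that matters for your data, a variant that keeps only the first non-null value might look like the sketch below. This is an untested-in-Stream-Analytics assumption on my part, and the second UserId value "OTHER" is made up purely for the demonstration.

```javascript
// Hypothetical variant of UDF.coalesce: keep the first non-null value per
// field instead of concatenating, and leave missing fields as null.
function main(items) {
  var merged = { UserId: null, PageType: null };
  // Guard against a missing or non-array dimensions value.
  if (!Array.isArray(items)) {
    return [merged];
  }
  for (var i = 0; i < items.length; i++) {
    var item = items[i] || {};
    // "!= null" is false only for null and undefined.
    if (merged.UserId === null && item.UserId != null) {
      merged.UserId = item.UserId;
    }
    if (merged.PageType === null && item.PageType != null) {
      merged.PageType = item.PageType;
    }
  }
  // Still a one-element array, so the query above can stay unchanged.
  return [merged];
}

// Local check in Node.js; "OTHER" is an invented second UserId.
var withDuplicate = main([
  { PageType: null, UserId: "SIRTSW" },
  { PageType: "Trade", UserId: "OTHER" }
]);
console.log(JSON.stringify(withDuplicate));
// [{"UserId":"SIRTSW","PageType":"Trade"}]

var allNull = main([{ PageType: null, UserId: null }]);
console.log(JSON.stringify(allNull));
// [{"UserId":null,"PageType":null}]
```

Because the return shape is the same as the original UDF, it can be registered under the same name without touching the query.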