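I am trying to use KSQL to merge multiple events from a single input stream into one output event per timestamp. I would also like the output event to contain the average of the input values, though that is a nice-to-have rather than a strict requirement.
Input stream: temperature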
event1: {location: "hallway", value: 23, property_Id: "123", timestamp: "1551645625878"}
event2: {location: "bedroom", value: 21, property_Id: "123", timestamp: "1551645625878"}
event3: {location: "kitchen", value: 20, property_Id: "123", timestamp: "1551645625878"}
event4: {location: "hallway", value: 19, property_Id: "123", timestamp: "9991645925878"}
event5: {location: "bedroom", value: 18, property_Id: "123", timestamp: "9991645925878"}
event6: {location: "kitchen", value: 18, property_Id: "123", timestamp: "9991645925878"}
(Desired) output stream:
event1:
{
"property_id": "123",
"timestamp": "1551645625878",
"average_temperature": 21,
"temperature": [
{
"location": "hallway",
"value": 23
},
{
"location": "bedroom",
"value": 21
},
{
"location": "kitchen",
"value": 20
}
]
}
event2:
{
"property_id": "123",
"timestamp": "9991645925878",
"average_temperature": 18,
"temperature": [
{
"location": "hallway",
"value": 19
},
{
"location": "bedroom",
"value": 18
},
{
"location": "kitchen",
"value": 18
}
]
}
As far as I can tell, this is not possible with KSQL. Can anyone confirm?
Answer 0 (score: 1)
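Correct, you cannot currently do this in KSQL. As of v5.1 / March 2019, KSQL can read nested objects but cannot build them: https://github.com/confluentinc/ksql/issues/2147 (please upvote or comment on the issue if you need this)
You can, however, compute the average like this: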
SELECT timestamp, SUM(value)/COUNT(*) AS avg_temp \
FROM input_stream \
GROUP BY timestamp;
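Because the nested "temperature" array cannot be built in KSQL, the grouping would have to happen outside it, for example in a consumer application. The following is only a minimal sketch of that grouping logic in Python, not a streaming implementation: it works on an in-memory list holding the sample events from the question, the function name merge_by_timestamp is hypothetical, and integer division is an assumption chosen to match the whole-number averages (21 and 18) in the desired output.

```python
from collections import defaultdict


def merge_by_timestamp(events):
    """Group events by (property_Id, timestamp) and build one merged record per group.

    Hypothetical helper for illustration; a real consumer would need windowing
    or some other rule to decide when a group is complete.
    """
    groups = defaultdict(list)
    for event in events:
        groups[(event["property_Id"], event["timestamp"])].append(event)

    merged = []
    for (property_id, timestamp), items in groups.items():
        values = [item["value"] for item in items]
        merged.append({
            "property_id": property_id,
            "timestamp": timestamp,
            # Assumption: integer division, to match the desired output above.
            "average_temperature": sum(values) // len(values),
            "temperature": [
                {"location": item["location"], "value": item["value"]}
                for item in items
            ],
        })
    return merged


# Sample events copied from the question.
events = [
    {"location": "hallway", "value": 23, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "bedroom", "value": 21, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "kitchen", "value": 20, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "hallway", "value": 19, "property_Id": "123", "timestamp": "9991645925878"},
    {"location": "bedroom", "value": 18, "property_Id": "123", "timestamp": "9991645925878"},
    {"location": "kitchen", "value": 18, "property_Id": "123", "timestamp": "9991645925878"},
]

merged = merge_by_timestamp(events)
```

The groups come out in the order their first event was seen, so merged[0] corresponds to timestamp 1551645625878 and merged[1] to 9991645925878.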