KSQL event merging: merge events in a single stream based on timestamp

Time: 2019-03-04 06:21:30

Tags: apache-kafka ksql

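I am trying to use KSQL to merge multiple events from a single input stream into a single output event, grouped by timestamp. I would also like the output event to contain the average of the input events' values, although that is not a hard requirement, just a nice-to-have.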

Input stream: temperature

event1: {location: "hallway", value: 23, property_Id: "123", timestamp: "1551645625878"} 
event2: {location: "bedroom", value: 21, property_Id: "123", timestamp: "1551645625878"}
event3: {location: "kitchen", value: 20, property_Id: "123", timestamp: "1551645625878"}
event4: {location: "hallway", value: 19, property_Id: "123", timestamp: "9991645925878"} 
event5: {location: "bedroom", value: 18, property_Id: "123", timestamp: "9991645925878"}
event6: {location: "kitchen", value: 18, property_Id: "123", timestamp: "9991645925878"}

(Desired) output stream:

event1:
{
    "property_id": "123",
    "timestamp": "1551645625878",
    "average_temperature": 21,   
    "temperature": [
        {
            "location": "hallway",
            "value": 23
        },
        {
            "location": "bedroom",
            "value": 21
        },
        {
            "location": "kitchen",
            "value": 20
        }
    ]
}

event2:
{
    "property_id": "123",
    "timestamp": "9991645925878",
    "average_temperature": 18,   
    "temperature": [
        {
            "location": "hallway",
            "value": 19
        },
        {
            "location": "bedroom",
            "value": 18
        },
        {
            "location": "kitchen",
            "value": 18
        }
    ]
}

As far as I can tell, this is not possible with KSQL. Can anyone confirm?

1 answer:

Answer 0: (score: 1)

Correct, you cannot currently do this in KSQL. As of v5.1 / March 2019, KSQL can read nested objects but cannot build them: https://github.com/confluentinc/ksql/issues/2147 (please upvote or comment on the issue if you need this)

You can do the average calculation with:

SELECT timestamp, SUM(value)/COUNT(*) AS avg_temp \
  FROM input_stream \
  GROUP BY timestamp;
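
Since the nested `temperature` array cannot be built in KSQL, the merge itself would have to happen outside KSQL, for example in a consumer application. The following is only a minimal sketch of that grouping logic in plain Python, with no Kafka client involved. The function name `merge_by_timestamp` is hypothetical, the events are hard-coded from the question, and the sketch assumes events are grouped on both `property_Id` and `timestamp` and that the average is truncated to a whole number, as in the desired output.

```python
from collections import defaultdict


def merge_by_timestamp(events):
    """Group events by (property_Id, timestamp) and build one merged record per group.

    Hypothetical helper for illustration; not part of KSQL or any Kafka library.
    """
    groups = defaultdict(list)
    for event in events:
        groups[(event["property_Id"], event["timestamp"])].append(event)

    merged = []
    for (property_id, timestamp), group in groups.items():
        values = [event["value"] for event in group]
        merged.append({
            "property_id": property_id,
            "timestamp": timestamp,
            # Floor division, so the result is a whole number as in the
            # desired output (assumption: truncation is what is wanted).
            "average_temperature": sum(values) // len(values),
            "temperature": [
                {"location": event["location"], "value": event["value"]}
                for event in group
            ],
        })
    return merged


# Sample events copied from the question.
events = [
    {"location": "hallway", "value": 23, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "bedroom", "value": 21, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "kitchen", "value": 20, "property_Id": "123", "timestamp": "1551645625878"},
    {"location": "hallway", "value": 19, "property_Id": "123", "timestamp": "9991645925878"},
    {"location": "bedroom", "value": 18, "property_Id": "123", "timestamp": "9991645925878"},
    {"location": "kitchen", "value": 18, "property_Id": "123", "timestamp": "9991645925878"},
]

merged = merge_by_timestamp(events)
```

In a real stream the input is unbounded, so an application would also need some rule (a time window, or an expected number of locations) to decide when a group is complete; this sketch sidesteps that by working on a finite list.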