Wrong serializer used on aggregate

Date: 2018-08-22 11:59:54

Tags: apache-kafka-streams

I'm working on a kafka-streams application in which I process log events. In this case, I want to aggregate values of type WorkflowInput into a Workflow type, and I'm having trouble getting the aggregation to work.

final KStream<String, WorkflowInput> filteredStream = someStream;
final KTable<String, Workflow> aggregatedWorkflows = filteredStream
    .peek((k, v) -> {
        if (!(v instanceof WorkflowInput)) {
            throw new AssertionError("Type not expected");
        }
    })
    .groupByKey()
    .<Workflow>aggregate(Workflow::new, (k, input, workflow) -> workflow.updateFrom(input),
            Materialized.<String, Workflow, KeyValueStore<Bytes, byte[]>>as("worflow-cache")
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.serdeFrom(new JsonSerializer<Workflow>(), new JsonDeserializer<Workflow>(Workflow.class))));

I'm getting the following exception:

Caused by: org.apache.kafka.streams.errors.StreamsException: A serializer (key: org.apache.kafka.common.serialization.StringSerializer / value: org.apache.kafka.common.serialization.StringSerializer) is not compatible to the actual key or value type (key type: java.lang.String / value type: workflowauditstreamer.WorkflowInput).

Two things to note:

* The value serializer is a StringSerializer, while I configured something different using withValueSerde.
* The actual value type is WorkflowInput, while I expected Workflow, since that is my aggregated value type.

I'm new to kafka-streams, so I may be overlooking something obvious, but I can't figure it out. What am I missing here?

1 Answer:

Answer 0 (score: 1)

If you overwrite the default Serde from the config, it is an operator in-place overwrite. It is not propagated downstream (as of Kafka 2.0; there is WIP to improve this).

Thus, you also need to pass the Serde you use into someStream = builder.stream(...).
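
A minimal sketch of what that could look like, assuming a hypothetical source topic name and that WorkflowInput is (de)serialized with the same JsonSerializer/JsonDeserializer pair used for Workflow in the question:

// Serde for the input type, mirroring the Workflow serde from the question.
final Serde<WorkflowInput> workflowInputSerde = Serdes.serdeFrom(
        new JsonSerializer<WorkflowInput>(),
        new JsonDeserializer<WorkflowInput>(WorkflowInput.class));

// Declare the value Serde at the source instead of relying on the default
// from the config, so the repartitioning triggered by groupByKey() no longer
// falls back to the default StringSerializer.
final KStream<String, WorkflowInput> someStream = builder.stream(
        "workflow-input-topic",  // hypothetical topic name
        Consumed.with(Serdes.String(), workflowInputSerde));

Alternatively, the Serde can be supplied at the grouping step instead, e.g. groupByKey(Serialized.with(Serdes.String(), workflowInputSerde)) in Kafka 2.0.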