I have a Spring Cloud Kafka Streams application that rekeys incoming data (selectKey) so that two topics can be joined, then maps the values and aggregates the data. Over time the consumer lag keeps growing, and scaling out by adding more instances of the application barely helps: in every configuration the lag keeps increasing.

I scaled from 1 instance up to 18, but it made little difference. The number of lagging messages keeps growing every 5 seconds, regardless of the number of instances.
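For reference, this is roughly how I watch the lag, using the standard Kafka CLI (the bootstrap server and group id below are placeholders; the group id equals the `application.id` of the Streams app):

```shell
# Show per-partition current offset, log-end offset and lag
# for the Streams application's consumer group (placeholder values)
kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --describe \
  --group my-streams-app-id
```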
    KStream<String, MappedOriginalSensorData> flattenedOriginalData = originalData
            .flatMap(flattenOriginalData())
            .through("atl-mapped-original-sensor-data-repartition",
                    Produced.with(Serdes.String(), new MappedOriginalSensorDataSerde()));

    //#2. Save the modelId and algorithm parts of the key of the error-score topic and reduce the key
    //    to installationId:assetId:tagName.
    //    Repartition ahead of time, avoiding multiple repartition topics and thereby duplicating data.
    KStream<String, MappedErrorScoreData> enrichedErrorData = errorScoreData
            .map(enrichWithModelAndAlgorithmAndReduceKey())
            .through("atl-mapped-error-score-data-repartition",
                    Produced.with(Serdes.String(), new MappedErrorScoreDataSerde()));

    return enrichedErrorData
            //#3. Join
            .join(flattenedOriginalData, join(),
                    JoinWindows.of(
                            // allow messages within one second to be joined together based on their timestamp
                            Duration.ofMillis(1000).toMillis())
                            // configure the retention period of the local state store involved in this join
                            .until(Long.parseLong(retention)),
                    Joined.with(
                            Serdes.String(),
                            new MappedErrorScoreDataSerde(),
                            new MappedOriginalSensorDataSerde()))
            //#4. Set the installation:assetId:modelInstance:algorithm:tag key back
            .selectKey((k, v) -> v.getOriginalKey())
            //#5. Map to ErrorScore (basically removing the originalKey field)
            .mapValues(removeOriginalKeyField())
            .through("atl-joined-data-repartition");
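To illustrate the join semantics in isolation (records from the two streams pair up when their keys match and their timestamps differ by at most 1000 ms), here is a minimal plain-Java sketch, independent of Kafka Streams; the record type is a simplified stand-in for my mapped data classes:

```java
import java.util.ArrayList;
import java.util.List;

public class WindowedJoinSketch {

    // Simplified stand-in for a keyed, timestamped record
    record Rec(String key, long timestampMs, String value) {}

    // Pair left and right records with the same key whose timestamps are
    // within windowMs of each other -- the JoinWindows behaviour of the join
    static List<String> join(List<Rec> left, List<Rec> right, long windowMs) {
        List<String> joined = new ArrayList<>();
        for (Rec l : left) {
            for (Rec r : right) {
                if (l.key().equals(r.key())
                        && Math.abs(l.timestampMs() - r.timestampMs()) <= windowMs) {
                    joined.add(l.value() + "+" + r.value());
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Rec> errorScores = List.of(
                new Rec("inst1:asset1:tag1", 1_000, "score=0.9"),
                new Rec("inst1:asset1:tag1", 5_000, "score=0.4"));
        List<Rec> sensorData = List.of(
                new Rec("inst1:asset1:tag1", 1_500, "temp=20"),   // within 1 s of the first score
                new Rec("inst1:asset1:tag1", 9_000, "temp=22"));  // outside every window

        System.out.println(join(errorScores, sensorData, 1_000));
        // [score=0.9+temp=20]
    }
}
```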
Then the aggregation part:
    Materialized<String, ErrorScore, WindowStore<Bytes, byte[]>> materialized = Materialized
            .as(localStore.getStoreName());

    // Set the retention of the changelog topic
    materialized.withLoggingEnabled(topicConfig);

    // Configure what the windows look like and how long data will be retained in the local stores
    TimeWindows configuredTimeWindows = getConfiguredTimeWindows(
            localStore.getTimeUnit(), Long.parseLong(topicConfig.get(RETENTION_MS)));

    // Processing description:
    // 2. With groupByKey we group the data on the new key
    // 3. With windowedBy we split up the data into time intervals depending on the provided LocalStore enum
    // 4. With reduce we determine the maximum value in the time window
    // 5. Materialized makes the result stored in a table
    stream.groupByKey()
            .windowedBy(configuredTimeWindows)
            .reduce((aggValue, newValue) -> getMaxErrorScore(aggValue, newValue), materialized);
}

private TimeWindows getConfiguredTimeWindows(long windowSizeMs, long retentionMs) {
    return TimeWindows.of(windowSizeMs).until(retentionMs);
}
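What the windowing plus reduce does can be illustrated without Kafka Streams: records are bucketed by their aligned window start (timestamp minus timestamp modulo window size), and within each bucket only the maximum score survives. A minimal sketch for a single key, using a plain map as the "state store":

```java
import java.util.Map;
import java.util.TreeMap;

public class WindowedMaxSketch {

    // Aligned window start, the same alignment TimeWindows uses
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Reduce timestamped scores to the maximum per window bucket, mimicking
    // groupByKey().windowedBy(...).reduce(max) for records of a single key
    static Map<Long, Double> maxPerWindow(long windowSizeMs, long[] timestamps, double[] scores) {
        Map<Long, Double> store = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            store.merge(windowStart(timestamps[i], windowSizeMs), scores[i], Math::max);
        }
        return store;
    }

    public static void main(String[] args) {
        long windowSize = 60_000; // one-minute windows
        long[] ts = {1_000, 30_000, 61_000, 62_000};
        double[] scores = {0.2, 0.7, 0.1, 0.5};
        System.out.println(maxPerWindow(windowSize, ts, scores));
        // {0=0.7, 60000=0.5}
    }
}
```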
I expected that increasing the number of instances would drastically reduce the consumer lag.

So this setup involves several topics:

* original-sensor-data
* error-score
* kstream-jointhis
* kstream-joinother
* atl-mapped-original-sensor-data-repartition
* atl-mapped-error-score-data-repartition
* atl-joined-data-repartition

The idea is to join the original sensor data with the error scores. The rekeying requires the atl-mapped-* topics. The join then uses the kstream-* topics, and finally, as the result of the join, atl-joined-data-repartition is filled. After that the aggregation creates topics as well, but I'll leave that out of scope for now.
original-sensor-data
  \
   \
    atl-mapped-original-sensor-data-repartition -- kstream-jointhis --\
                                                                       >-- atl-joined-data-repartition
    atl-mapped-error-score-data-repartition ---- kstream-joinother ---/
   /
  /
error-score
Since I introduced the join and the atl-mapped-* topics, increasing the number of instances no longer seems to have much effect, and I'm wondering whether this topology could be a bottleneck in itself. Judging by the consumer lag, the lag on the original-sensor-data and error-score topics is much smaller than on the atl-mapped-* topics. Is there a way to deal with this by removing these changelogs, or is this something that simply cannot be scaled?
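I'm aware that the changelog could in principle be switched off per store via the DSL, along these lines (a sketch, not what I run today; as far as I understand, the store could then no longer be restored from Kafka after a failure):

```java
// Sketch: dropping the changelog topic for the aggregation store entirely,
// trading fault tolerance of that store for fewer internal topics
Materialized<String, ErrorScore, WindowStore<Bytes, byte[]>> materialized = Materialized
        .<String, ErrorScore, WindowStore<Bytes, byte[]>>as(localStore.getStoreName())
        .withLoggingDisabled();
```

But I'm not sure whether that would actually address the lag, or just remove the safety net.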