Does the use of changelogs cause the application itself to become the bottleneck?

Asked: 2019-05-22 14:38:33

Tags: apache-kafka-streams spring-cloud-stream

I have a Spring Cloud Kafka Streams application that rekeys incoming data (selectKey, mapValues) to be able to join two topics, and then aggregates the data. Over time the consumer lag seems to increase, and scaling out by adding multiple instances of the app doesn't help much: in every case the consumer lag keeps growing.

I scaled from 1 up to 18 instances, but it made little difference. The number of messages lagging behind keeps increasing every 5 seconds, regardless of the number of instances.

KStream<String, MappedOriginalSensorData> flattenedOriginalData = originalData
                .flatMap(flattenOriginalData())
                .through("atl-mapped-original-sensor-data-repartition", Produced.with(Serdes.String(), new MappedOriginalSensorDataSerde()));


        //#2. Save the modelId and algorithm parts of the key of the error-score topic and reduce the key
        //    to installationId:assetId:tagName
        //    Repartition ahead of time to avoid multiple repartition topics and thereby duplicating data
        KStream<String, MappedErrorScoreData> enrichedErrorData = errorScoreData
                .map(enrichWithModelAndAlgorithmAndReduceKey())
                .through("atl-mapped-error-score-data-repartition", Produced.with(Serdes.String(), new MappedErrorScoreDataSerde()));


        return enrichedErrorData
                //#3. Join
                .join(flattenedOriginalData, join(),
                        JoinWindows.of(
                                // allow messages within one second to be joined together based on their timestamp
                                Duration.ofMillis(1000).toMillis())
                                // configure the retention period of the local state store involved in this join
                                .until(Long.parseLong(retention)),
                        Joined.with(
                                Serdes.String(),
                                new MappedErrorScoreDataSerde(),
                                new MappedOriginalSensorDataSerde()))
                //#4. Set installation:assetid:modelinstance:algorithm::tag key back
                .selectKey((k,v) -> v.getOriginalKey())
                //#5. Map to ErrorScore (basically removing the originalKey field)
                .mapValues(removeOriginalKeyField())
                .through("atl-joined-data-repartition");

Then the aggregation part:

        Materialized<String, ErrorScore, WindowStore<Bytes, byte[]>> materialized = Materialized
                .as(localStore.getStoreName());

        // Set retention of changelog topic
        materialized.withLoggingEnabled(topicConfig);

        // Configure what the windows look like and how long data will be retained in local stores
        TimeWindows configuredTimeWindows = getConfiguredTimeWindows(
                localStore.getTimeUnit(), Long.parseLong(topicConfig.get(RETENTION_MS)));

        // Processing description:
        // 2. With groupByKey we group the data on the new key
        // 3. With windowedBy we split up the data into time intervals depending on the provided LocalStore enum
        // 4. With reduce we determine the maximum value in the time window
        // 5. Materialized stores the result in a table
        stream.groupByKey()
        .windowedBy(configuredTimeWindows)
        .reduce((aggValue, newValue) -> getMaxErrorScore(aggValue, newValue), materialized);
    }

    private TimeWindows getConfiguredTimeWindows(long windowSizeMs, long retentionMs) {
        TimeWindows timeWindows = TimeWindows.of(windowSizeMs);
        timeWindows.until(retentionMs);
        return timeWindows;
    }
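For reference, the topicConfig map handed to withLoggingEnabled above maps one-to-one onto broker-side topic configs of the changelog topic. A hypothetical example of what it might contain (the concrete values are assumptions, not taken from my actual setup):

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.common.config.TopicConfig;

    Map<String, String> topicConfig = new HashMap<>();
    // Windowed-store changelogs default to compact,delete so that old windows age out
    topicConfig.put(TopicConfig.CLEANUP_POLICY_CONFIG, "compact,delete");
    // Broker-side retention; should be at least as long as the window retention
    topicConfig.put(TopicConfig.RETENTION_MS_CONFIG, String.valueOf(24 * 60 * 60 * 1000L));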

I expected that increasing the number of instances would dramatically decrease the consumer lag.

So there are multiple topics involved in this setup, such as:

* original-sensor-data
* error-score
* kstream-joinother
* kstream-jointhis
* atl-mapped-original-sensor-data-repartition
* atl-mapped-error-score-data-repartition
* atl-joined-data-repartition

The idea is to join the original sensor data with the error scores. Rekeying requires the atl-mapped-* topics; the join then uses the kstream-* topics, and finally, as the result of the join, atl-joined-data-repartition gets filled. After that, the aggregation creates topics as well, but I'll leave that out of scope for now.

original-sensor-data 
\
 \
  \   atl-mapped-original-sensor-data-repartition-- kstream-jointhis -\
  /   atl-mapped-error-score-data-repartition    -- kstream-joinother -\
 /                                                                      \ 
error-score                                          atl-joined-data-repartition 

Since increasing the number of instances doesn't seem to have much effect anymore since I introduced the join and the atl-mapped-* topics, I'm wondering if it is possible that this topology becomes its own bottleneck. Judging from the consumer lag, the original-sensor-data and error-score topics have a much smaller consumer lag than the atl-mapped-* topics. Is there a way to cope with this by removing these changelogs, or does this result in not being able to scale?
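To make the last question concrete: as far as I know the state-store changelogs (as opposed to the repartition topics) can be switched off via withLoggingDisabled, at the price of losing fault-tolerant state restoration after a rebalance or crash. A minimal sketch reusing the types from the code above; the StreamJoined variant requires Kafka Streams 2.4+:

    // Aggregation store without a changelog topic (state cannot be restored after failure)
    Materialized<String, ErrorScore, WindowStore<Bytes, byte[]>> materialized = Materialized
            .<String, ErrorScore, WindowStore<Bytes, byte[]>>as(localStore.getStoreName())
            .withLoggingDisabled();

    // Join state stores without changelogs (Kafka Streams 2.4+)
    StreamJoined<String, MappedErrorScoreData, MappedOriginalSensorData> streamJoined = StreamJoined
            .with(Serdes.String(), new MappedErrorScoreDataSerde(), new MappedOriginalSensorDataSerde())
            .withLoggingDisabled();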

0 Answers
