High consumer lag in a Kafka Streams application using a persistent key-value store

Posted: 2018-05-18 15:24:13

Tags: java performance optimization apache-kafka apache-kafka-streams

My Streams application uses a state store, and it is not getting as much throughput as I expected; its consumer lag keeps growing. I would like to know whether there is any obvious configuration I can tune, or anything else that would help optimize my throughput.

In my main method I set these properties:

public static void main(String[] args) {
    final Properties config = new Properties();
    config.put(StreamsConfig.APPLICATION_ID_CONFIG, "reading-stream");
    config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, KAFKA_URL);
    config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    config.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, "52428800"); // 50 MB per fetch
    config.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 0);        // return fetches immediately

    final MyStream stream = new MyStream(config);
    stream.start();

    Runtime.getRuntime().addShutdownHook(new Thread(stream::close));
}
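For throughput, the Streams-level settings usually matter more than the raw consumer fetch settings shown above. A sketch of the knobs that are typically tuned first — the values here are only illustrative starting points, not recommendations:

```java
// Illustrative Streams-level throughput settings (example values only).
final Properties config = new Properties();
config.put(StreamsConfig.APPLICATION_ID_CONFIG, "reading-stream");

// More stream threads let one instance process several partitions in parallel.
config.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);

// A larger record cache batches writes to the state store and its changelog.
config.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 64 * 1024 * 1024L);

// A longer commit interval reduces flush/commit overhead,
// at the cost of more reprocessing after a failure.
config.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 5000);
```

Whether these help depends on the partition count and where the bottleneck actually is, so they are worth changing one at a time while watching the lag.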

MyStream.start() contains this code:

final Serde<Reading> readingSerde = Utils.createSerde(Reading.class);
final Serde<Group> groupSerde = Utils.createSerde(Group.class);
final Serde<CheckState> checkStateSerde = Utils.createSerde(CheckState.class);
final StreamsBuilder builder = new StreamsBuilder();

final StoreBuilder<KeyValueStore<String, CheckState>> storeBuilder =
    Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore(STORE_STATE),
        Serdes.String(),
        checkStateSerde
    ).withLoggingEnabled(new HashMap<>());
builder.addStateStore(storeBuilder);
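Since the persistent store is backed by RocksDB, its on-disk behavior can be tuned through the `rocksdb.config.setter` Streams config. A minimal sketch of such a hook — the class name and the values are illustrative assumptions, not recommendations:

```java
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

// Illustrative RocksDB tuning hook (example values only).
public class CustomRocksDBConfig implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        // Larger memtables mean fewer flushes to disk,
        // which matters on slower volumes such as gp2.
        options.setWriteBufferSize(32 * 1024 * 1024L);
        options.setMaxWriteBufferNumber(4);
    }
}
```

It would be registered alongside the other properties, e.g. `config.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, CustomRocksDBConfig.class);`.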

final KTable<String, Group> group = builder
    .table("group-topic",
        Consumed.with(Serdes.String(), groupSerde)
    );

final KStream<String, Reading> readings = builder
    .stream("reading-topic",
        Consumed.with(Serdes.String(), readingSerde)
            .withOffsetResetPolicy(Topology.AutoOffsetReset.LATEST));

readings
    .join(group,
        ....
    )
    .flatMap(
        ....
    )
    .process(() -> new MyProcessor(), STORE_STATE);

final Topology topology = builder.build();
streams = new KafkaStreams(topology, this.config);
streams.start();

I am also considering different disk types on AWS (currently using gp2)...

What other ideas could help optimize this code?

0 Answers:

No answers yet.