Kafka Streams - consumer memory overload

Asked: 2019-01-07 10:15:09

Tags: apache-kafka apache-kafka-streams spring-kafka

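I am planning a Spring + Kafka Streams application that processes incoming messages and stores the updated internal state that results from those messages. This state is expected to reach about 500 MB per unique key (with probably around 10k unique keys spread across 2k partitions).

For the application to run efficiently, this state generally has to be kept in memory, but even if it were kept on disk I would still face a similar problem (albeit only later, as the application scales).

I plan to deploy this application to a dynamically scaling environment (such as AWS) and will set a minimum number of instances, but I am wary of two scenarios:

  • On first startup (when perhaps only one consumer starts first), it will be unable to handle being assigned all of the partitions, because the in-memory state would overflow the instance's available memory.
  • After a major failure (an AWS availability zone outage), 33% of the consumers could be knocked out of the group, and the extra memory load on the remaining instances could actually take out everyone who is left.

How do people protect consumers from taking on more partitions than they can handle, so that they do not overflow their available memory/disk?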
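For a sense of scale, the figures in the question can be worked through with a little arithmetic. The sketch below is only an illustration: it assumes keys are spread evenly across partitions, and the 64 GB per-instance budget is a hypothetical number that does not come from the question.

```java
public class StateCapacitySketch {

    /** Average state per partition in MB, assuming keys are spread evenly. */
    public static double stateMbPerPartition(long uniqueKeys, double mbPerKey, int partitions) {
        return uniqueKeys * mbPerKey / partitions;
    }

    /** Largest number of partitions whose state fits in the given memory budget. */
    public static int maxPartitionsPerInstance(double memoryBudgetMb, double mbPerPartition) {
        return (int) Math.floor(memoryBudgetMb / mbPerPartition);
    }

    /** Factor by which each survivor's load grows when a fraction of instances is lost. */
    public static double loadFactorAfterLoss(double fractionLost) {
        return 1.0 / (1.0 - fractionLost);
    }

    public static void main(String[] args) {
        // 10k keys, 500 MB per key and 2k partitions are the figures from the question.
        double perPartition = stateMbPerPartition(10_000, 500, 2_000);
        System.out.println("State per partition (MB): " + perPartition);
        // 64 GB is a hypothetical per-instance budget, not a figure from the question.
        System.out.println("Max partitions in 64 GB: " + maxPartitionsPerInstance(64_000, perPartition));
        System.out.println("Load factor after losing a third: " + loadFactorAfterLoss(1.0 / 3.0));
    }
}
```

Under these assumptions each partition carries roughly 2.5 GB of state, about 5 TB in total, and losing a third of the instances raises the load on each survivor by about half.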

1 Answer:

Answer 0 (score: 2)

See the Kafka documentation.

Starting with 0.11...

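EDIT

For the second use case (it also applies to the first), perhaps you could implement a custom PartitionAssignor that limits the number of partitions assigned to each instance.

I haven't tried it; I don't know how the broker will react to unassigned partitions.

EDIT2

This seems to work, but YMMV...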

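package com.example;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.apache.kafka.clients.consumer.RoundRobinAssignor;
// In Kafka clients 2.4 and later, Subscription is ConsumerPartitionAssignor.Subscription instead.
import org.apache.kafka.clients.consumer.internals.PartitionAssignor.Subscription;
import org.apache.kafka.common.TopicPartition;

public class NoMoreThanFiveAssignor extends RoundRobinAssignor {

    @Override
    public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
            Map<String, Subscription> subscriptions) {

        // Let the round-robin assignor compute the normal assignment first.
        Map<String, List<TopicPartition>> assignments = super.assign(partitionsPerTopic, subscriptions);
        // Then truncate any member's list to at most five partitions; the rest stay unassigned.
        assignments.forEach((memberId, assigned) -> {
            if (assigned.size() > 5) {
                System.out.println("Reducing assignments from " + assigned.size() + " to 5 for " + memberId);
                assignments.put(memberId,
                        assigned.stream()
                            .limit(5)
                            .collect(Collectors.toList()));
            }
        });
        return assignments;
    }

}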

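package com.example;

import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;

@SpringBootApplication
public class So54072362Application {

    public static void main(String[] args) {
        SpringApplication.run(So54072362Application.class, args);
    }

    // Test topic with 15 partitions and a replication factor of 1.
    @Bean
    public NewTopic topic() {
        return new NewTopic("so54072362", 15, (short) 1);
    }

    @KafkaListener(id = "so54072362", topics = "so54072362")
    public void listen(ConsumerRecord<?, ?> record) {
        System.out.println(record);
    }

    // Sends one record to each of the 15 partitions on startup.
    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, String> template) {
        return args -> {
            for (int i = 0; i < 15; i++) {
                template.send("so54072362", i, "foo", "bar");
            }
        };
    }

}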

spring.kafka.consumer.properties.partition.assignment.strategy=com.example.NoMoreThanFiveAssignor
spring.kafka.consumer.enable-auto-commit=false
spring.kafka.consumer.auto-offset-reset=earliest
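
Outside Spring Boot, the same three settings could be passed to a plain Kafka consumer under their native property names. This is a minimal sketch, not a complete configuration: the class name RawConsumerPropsSketch is made up for illustration, and bootstrap servers, group id and deserializers are deliberately left out.

```java
import java.util.Properties;

public class RawConsumerPropsSketch {

    /** Builds plain Kafka consumer properties equivalent to the Spring Boot settings above. */
    public static Properties consumerProps() {
        Properties props = new Properties();
        // Fully qualified class name of the custom assignor from the answer.
        props.setProperty("partition.assignment.strategy", "com.example.NoMoreThanFiveAssignor");
        // Offsets are not auto-committed by the client.
        props.setProperty("enable.auto.commit", "false");
        // Start from the beginning of each partition when no committed offset exists.
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        consumerProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```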

Reducing assignments from 15 to 5 for consumer-2-f37221f8-70bb-421d-9faf-6591cc26a76a
2019-01-07 15:24:28.288  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.AbstractCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Successfully joined group with generation 7
2019-01-07 15:24:28.289  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Setting newly assigned partitions [so54072362-0, so54072362-1, so54072362-2, so54072362-3, so54072362-4]
2019-01-07 15:24:28.296  INFO 23485 --- [o54072362-0-C-1] o.s.k.l.KafkaMessageListenerContainer    : partitions assigned: [so54072362-0, so54072362-1, so54072362-2, so54072362-3, so54072362-4]
2019-01-07 15:24:46.303  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.AbstractCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Attempt to heartbeat failed since group is rebalancing
2019-01-07 15:24:46.303  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Revoking previously assigned partitions [so54072362-0, so54072362-1, so54072362-2, so54072362-3, so54072362-4]
2019-01-07 15:24:46.303  INFO 23485 --- [o54072362-0-C-1] o.s.k.l.KafkaMessageListenerContainer    : partitions revoked: [so54072362-0, so54072362-1, so54072362-2, so54072362-3, so54072362-4]
2019-01-07 15:24:46.304  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.AbstractCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] (Re-)joining group
Reducing assignments from 8 to 5 for consumer-2-c9a6928a-520c-4646-9dd9-4da14636744b
Reducing assignments from 7 to 5 for consumer-2-f37221f8-70bb-421d-9faf-6591cc26a76a
2019-01-07 15:24:46.310  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.AbstractCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Successfully joined group with generation 8
2019-01-07 15:24:46.311  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Setting newly assigned partitions [so54072362-9, so54072362-5, so54072362-7, so54072362-1, so54072362-3]
2019-01-07 15:24:46.315  INFO 23485 --- [o54072362-0-C-1] o.s.k.l.KafkaMessageListenerContainer    : partitions assigned: [so54072362-9, so54072362-5, so54072362-7, so54072362-1, so54072362-3]
2019-01-07 15:24:58.324  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.AbstractCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Attempt to heartbeat failed since group is rebalancing
2019-01-07 15:24:58.324  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Revoking previously assigned partitions [so54072362-9, so54072362-5, so54072362-7, so54072362-1, so54072362-3]
2019-01-07 15:24:58.324  INFO 23485 --- [o54072362-0-C-1] o.s.k.l.KafkaMessageListenerContainer    : partitions revoked: [so54072362-9, so54072362-5, so54072362-7, so54072362-1, so54072362-3]
2019-01-07 15:24:58.324  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.AbstractCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] (Re-)joining group
2019-01-07 15:24:58.330  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.AbstractCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Successfully joined group with generation 9
2019-01-07 15:24:58.332  INFO 23485 --- [o54072362-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=so54072362] Setting newly assigned partitions [so54072362-14, so54072362-11, so54072362-5, so54072362-8, so54072362-2]
2019-01-07 15:24:58.336  INFO 23485 --- [o54072362-0-C-1] o.s.k.l.KafkaMessageListenerContainer    : partitions assigned: [so54072362-14, so54072362-11, so54072362-5, so54072362-8, so54072362-2]

Of course, this leaves the unassigned partitions dangling, but it sounds like that is what you want until the zone comes back online.
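
To see how many partitions are left dangling as members join, the capping idea can be simulated without a broker. The sketch below is only an approximation: roundRobin is a simplified stand-in for Kafka's RoundRobinAssignor, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CappedAssignmentSketch {

    /** Deals partitions 0..partitionCount-1 to members in turn (simplified round robin). */
    public static Map<String, List<Integer>> roundRobin(List<String> members, int partitionCount) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        members.forEach(m -> result.put(m, new ArrayList<>()));
        for (int p = 0; p < partitionCount; p++) {
            result.get(members.get(p % members.size())).add(p);
        }
        return result;
    }

    /** Returns a copy in which no member keeps more than max partitions. */
    public static Map<String, List<Integer>> cap(Map<String, List<Integer>> assignments, int max) {
        Map<String, List<Integer>> capped = new LinkedHashMap<>();
        assignments.forEach((member, parts) ->
                capped.put(member, new ArrayList<>(parts.subList(0, Math.min(max, parts.size())))));
        return capped;
    }

    /** Lists the partitions that no member owns after capping. */
    public static List<Integer> unassigned(Map<String, List<Integer>> capped, int partitionCount) {
        List<Integer> missing = new ArrayList<>();
        for (int p = 0; p < partitionCount; p++) {
            missing.add(p);
        }
        capped.values().forEach(missing::removeAll);
        return missing;
    }

    public static void main(String[] args) {
        // Grow the group one member at a time, as in the log output above.
        List<String> members = new ArrayList<>();
        for (String m : new String[] {"a", "b", "c"}) {
            members.add(m);
            Map<String, List<Integer>> capped = cap(roundRobin(members, 15), 5);
            System.out.println(members.size() + " member(s), unassigned: " + unassigned(capped, 15));
        }
    }
}
```

With 15 partitions and a cap of 5, one, two and three members leave 10, 5 and 0 partitions unassigned. That matches the "Reducing assignments" lines in the log, which report 15, then 8 and 7, being cut to 5. In a real deployment it would probably be worth logging or alerting on the unassigned list, since nothing consumes those partitions until more members join.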