Testing Kafka HA and getting NetworkException: The server disconnected before a response was received

Date: 2019-04-04 18:39:31

Tags: apache-kafka

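I am running Confluent Kafka 4.1.1 Community.

I have:

  • Min in-sync replicas = 2
  • Topic: 1 partition, replication factor 3
  • 3 brokers in total.
  • Producer set to acks = -1
  • All other producer settings are at their defaults.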

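I started my application, which began writing records to Kafka. I then deliberately killed one of the brokers and immediately got: org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.

Given the settings above, shouldn't the producer's write() succeed without throwing an error?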

Clarifications

  • I killed the broker deliberately.
  • This seems to happen only when the leader broker is killed?

2 answers:

Answer 0: (score: 1)

Without seeing the full configuration and the log messages, it is hard to say for sure.

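In Kafka, all writes go through the partition leader. In your setup you killed 1 of the 3 brokers, so it should still be possible to write to the remaining 2 brokers and get an acknowledgement. However, if the broker you killed was the leader node, that can cause the exception.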
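
The cases described above can be sketched with a toy model. This is plain Python, not Kafka client code: `PartitionState` and `write_outcome` are hypothetical names, and the returned strings are paraphrases, not real exception messages. It assumes `acks=all` (the same setting as `acks=-1`) and the question's topic with 3 replicas and min in-sync replicas of 2.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    """Toy model of one partition; not part of any Kafka API."""
    leader_alive: bool
    in_sync_replicas: int      # size of the ISR, leader included
    min_insync_replicas: int   # the min in-sync replicas setting


def write_outcome(state: PartitionState) -> str:
    """Describe what a producer using acks=all would see for one send."""
    if not state.leader_alive:
        # Requests sent to the dead leader fail with a disconnect until a
        # new leader is elected and the client refreshes its metadata.
        return "retriable error: leader unavailable"
    if state.in_sync_replicas < state.min_insync_replicas:
        return "rejected: not enough in-sync replicas"
    return "acknowledged"


# The question's setup: 3 replicas, min in-sync replicas = 2.
follower_killed = PartitionState(True, in_sync_replicas=2, min_insync_replicas=2)
leader_killed = PartitionState(False, in_sync_replicas=2, min_insync_replicas=2)
two_killed = PartitionState(True, in_sync_replicas=1, min_insync_replicas=2)

for name, state in [
    ("follower killed", follower_killed),
    ("leader killed, before election", leader_killed),
    ("two brokers killed", two_killed),
]:
    print(name, "->", write_outcome(state))
```

Under these assumptions, only the window between the leader dying and the new leader taking over produces the transient error, which matches the observation that the exception shows up only when the leader is killed.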

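From the documentation:

acks=all This means the leader will wait for the full set of in-sync replicas to acknowledge the record. This guarantees that the record will not be lost as long as at least one in-sync replica remains alive. This is the strongest available guarantee.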

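In any case, you can set retries to a value greater than 0 and see how it behaves: a new leader should be elected, and your writes should eventually succeed.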
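
To illustrate why retries matter here, the following is a small simulation, not real producer code. `FakeCluster`, `NetworkError`, and `send_with_retries` are hypothetical stand-ins, and the backoff sleep between attempts is left out for brevity. It assumes the leader is unreachable for a fixed number of sends and available afterwards.

```python
class NetworkError(Exception):
    """Stand-in for a retriable error such as NetworkException."""


class FakeCluster:
    """Hypothetical cluster whose leader is unavailable for the first N sends."""

    def __init__(self, failed_sends_before_election: int):
        self.remaining_failures = failed_sends_before_election

    def send(self, record: str) -> str:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise NetworkError(
                "The server disconnected before a response was received."
            )
        return f"acked {record}"


def send_with_retries(cluster: FakeCluster, record: str, retries: int) -> str:
    """Send once, then retry up to `retries` more times on NetworkError."""
    attempt = 0
    while True:
        try:
            return cluster.send(record)
        except NetworkError:
            if attempt >= retries:
                raise  # retries exhausted: the error reaches the caller
            attempt += 1


# With retries=0 the first disconnect surfaces immediately.
try:
    send_with_retries(FakeCluster(1), "r1", retries=0)
except NetworkError as exc:
    print("retries=0 failed:", exc)

# With retries=3 the same outage is absorbed.
print("retries=3:", send_with_retries(FakeCluster(1), "r1", retries=3))
```

The ProducerConfig dump in the other answer of this thread shows `retries = 0` before its change, so a producer left at those settings would behave like the first case.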

Answer 1: (score: 1)

For the Spring Cloud Stream Kafka binder used with Azure Event Hubs for Kafka.
Exception:

{"timestamp":"2020-09-23 23:37:18.541","level":"ERROR","class":"org.springframework.kafka.support.LoggingProducerListener.onError 84", "thread":"kafka-producer-network-thread | producer-2","traceId":"","message":Exception thrown when sending a message with key='null' and payload='{123, 34, 115, 116, 97, 116, 117, 115, 34, 58, 34, 114, 101, 97, 100, 121, 34, 44, 34, 101, 118, 101...' to topic executor-networkexception and partition 3:}
org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.

{"timestamp":"2020-09-23 23:37:18.545","level":"WARN ","class":"org.apache.kafka.clients.producer.internals.Sender.completeBatch 568", "thread":"kafka-producer-network-thread | producer-2","traceId":"","message":[Producer clientId=producer-2] Received invalid metadata error in produce request on partition executor-networkexception-3 due to org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.. Going to request metadata update now}

Solution: set the connection idle time, the number of retries, and the retry backoff time:

spring:
  cloud:
    stream:
      kafka:
        binder:
          brokers: srsmvsdneventhubstage.servicebus.windows.net:9093
          configuration:
            sasl.jaas.config: 'org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://xxxxx.servicebus.windows.net/;=";'
            sasl.mechanism: PLAIN
            security.protocol: SASL_SSL
            retries: 3
            retry.backoff.ms: 60
            connections.max.idle.ms: 240000
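
As a rough worked example of what these retry values allow, the arithmetic can be written out as below. `retry_budget` is a hypothetical helper, and the calculation assumes a fixed pause of `retry.backoff.ms` between attempts; time spent waiting on each request (bounded by `request.timeout.ms`) is not counted.

```python
def retry_budget(retries: int, retry_backoff_ms: int) -> dict:
    """Upper bounds implied by a fixed backoff between attempts."""
    return {
        "max_attempts": retries + 1,                      # first try + retries
        "total_backoff_ms": retries * retry_backoff_ms,   # pauses between tries
    }


# Values from the configuration above: retries: 3, retry.backoff.ms: 60
print(retry_budget(retries=3, retry_backoff_ms=60))
# {'max_attempts': 4, 'total_backoff_ms': 180}
```

If a reconnect or leader change takes noticeably longer than this budget, the retries may run out first; raising either value widens the window.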

References:

http://kafka.apache.org/090/documentation.html (read http://kafka.apache.org/090/documentation.html#producerconfigs)

https://github.com/Azure/azure-event-hubs-for-kafka/blob/master/CONFIGURATION.md (read connections.max.idle.ms)

Logs:

"org.apache.kafka.clients.producer.ProducerConfig.logAll 279", "thread":"hz._hzInstance_1_dev.cached.thread-14","traceId":"","message":ProducerConfig values: 
        acks = 1
        batch.size = 16384
        bootstrap.servers = [srsmvsdneventhubstage.servicebus.windows.net:9093]
        buffer.memory = 33554432
        client.id = 
        compression.type = none
        connections.max.idle.ms = 540000
        enable.idempotence = false
        interceptor.classes = []
        key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
        linger.ms = 0
        max.block.ms = 60000
        max.in.flight.requests.per.connection = 5
        max.request.size = 1048576
        metadata.max.age.ms = 300000
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
        receive.buffer.bytes = 32768
        reconnect.backoff.max.ms = 1000
        reconnect.backoff.ms = 50
        request.timeout.ms = 30000
        retries = 0
        retry.backoff.ms = 100
        sasl.client.callback.handler.class = null
        sasl.jaas.config = [hidden]
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.login.callback.handler.class = null
        sasl.login.class = null
        sasl.login.refresh.buffer.seconds = 300
        sasl.login.refresh.min.period.seconds = 60
        sasl.login.refresh.window.factor = 0.8
        sasl.login.refresh.window.jitter = 0.05
        sasl.mechanism = PLAIN
        security.protocol = SASL_SSL
        send.buffer.bytes = 131072
        ssl.cipher.suites = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
        ssl.endpoint.identification.algorithm = https
        ssl.key.password = null
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.location = null
        ssl.keystore.password = null
        ssl.keystore.type = JKS
        ssl.protocol = TLS
        ssl.provider = null
        ssl.secure.random.implementation = null
        ssl.trustmanager.algorithm = PKIX
        ssl.truststore.location = null
        ssl.truststore.password = null
        ssl.truststore.type = JKS
        transaction.timeout.ms = 60000
        transactional.id = null
        value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
    }

New:

"org.apache.kafka.clients.producer.ProducerConfig.logAll 279", "thread":"hz._hzInstance_1_dev.cached.thread-20","traceId":"","message":ProducerConfig values: 
            acks = 1
            batch.size = 16384
            bootstrap.servers = [xxxxx.servicebus.windows.net:9093]
            buffer.memory = 33554432
            client.id = 
            compression.type = none
            **connections.max.idle.ms = 240000**
            enable.idempotence = false
            interceptor.classes = []
            key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
            linger.ms = 0
            max.block.ms = 60000
            max.in.flight.requests.per.connection = 5
            max.request.size = 1048576
            metadata.max.age.ms = 300000
            metric.reporters = []
            metrics.num.samples = 2
            metrics.recording.level = INFO
            metrics.sample.window.ms = 30000
            partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
            receive.buffer.bytes = 32768
            reconnect.backoff.max.ms = 1000
            reconnect.backoff.ms = 50
            request.timeout.ms = 30000
            **retries = 3**
            **retry.backoff.ms = 60**
            sasl.client.callback.handler.class = null
            sasl.jaas.config = [hidden]
            sasl.kerberos.kinit.cmd = /usr/bin/kinit
            sasl.kerberos.min.time.before.relogin = 60000
            sasl.kerberos.service.name = null
            sasl.kerberos.ticket.renew.jitter = 0.05
            sasl.kerberos.ticket.renew.window.factor = 0.8
            sasl.login.callback.handler.class = null
            sasl.login.class = null
            sasl.login.refresh.buffer.seconds = 300
            sasl.login.refresh.min.period.seconds = 60
            sasl.login.refresh.window.factor = 0.8
            sasl.login.refresh.window.jitter = 0.05
            sasl.mechanism = PLAIN
            security.protocol = SASL_SSL
            send.buffer.bytes = 131072
            ssl.cipher.suites = null
            ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
            ssl.endpoint.identification.algorithm = https
            ssl.key.password = null
            ssl.keymanager.algorithm = SunX509
            ssl.keystore.location = null
            ssl.keystore.password = null
            ssl.keystore.type = JKS
            ssl.protocol = TLS
            ssl.provider = null
            ssl.secure.random.implementation = null
            ssl.trustmanager.algorithm = PKIX
            ssl.truststore.location = null
            ssl.truststore.password = null
            ssl.truststore.type = JKS
            transaction.timeout.ms = 60000
            transactional.id = null
            value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
        }