Question

I am trying to load roughly 50,000 messages into a Kafka topic. At the start of a few runs, but not every time, I get the exception below.

org.apache.kafka.common.KafkaException: Cannot execute transactional method because we are in an error state
    at org.apache.kafka.clients.producer.internals.TransactionManager.maybeFailWithError(TransactionManager.java:784) ~[kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.producer.internals.TransactionManager.beginAbort(TransactionManager.java:229) ~[kafka-clients-2.0.0.jar:?]
    at org.apache.kafka.clients.producer.KafkaProducer.abortTransaction(KafkaProducer.java:679) ~[kafka-clients-2.0.0.jar:?]
    at myPackage.persistUpdatesPostAction(MyCode.java:??) ~[aKafka.jar:?]
...
Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId, or the producer's transaction has been expired by the broker.

Code Block is below:  
--------------------  
public void persistUpdatesPostAction(List<Message> messageList )
{
    if ((messageList == null) || (messageList.isEmpty())) 
    {
        return;
    }
    logger.createDebug("Messages in batch(postAction) : "+ messageList.size());
    Producer<String,String> producer = KafkaUtils.getProducer(Thread.currentThread().getName());
    try
    {
        producer.beginTransaction();
        createKafkaBulkInsert1(producer, messageList, "Topic1");
        createKafkaBulkInsert2(producer, messageList, "Topic2");
        createKafkaBulkInsert3(producer, messageList, "Topic3");
        producer.commitTransaction();
    }
    catch (Exception e) {
        producer.abortTransaction();
        producer.close();
        KafkaUtils.removeProducer(Thread.currentThread().getName());
    }
}

-----------

static Properties setPropertiesProducer()
{
    Properties temp = new Properties();
    temp.put("bootstrap.servers", "localhost:9092");
    temp.put("acks", "all");
    temp.put("retries", 1);
    temp.put("batch.size", 16384);
    temp.put("linger.ms", 5);
    temp.put("buffer.memory", 33554432);
    temp.put("key.serializer",   "org.apache.kafka.common.serialization.StringSerializer");
    temp.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    return temp;
}

public static Producer<String, String> getProducer(String aThreadId)
{
    if ((producerMap.size() == 0) || (producerMap.get(aThreadId) == null))
    {
        Properties temp = producerProps;
        temp.put("transactional.id", aThreadId);
        Producer<String, String> producer = new KafkaProducer<String, String>(temp);
        producerMap.put(aThreadId, producer);
        producer.initTransactions();
        return producer;
    }
    return producerMap.get(aThreadId);
}

public static void removeProducer(String aThreadId)
{
    logger.createDebug("Removing Thread ID :" + aThreadId);
    if (producerMap.get(aThreadId) == null)
        return;
    producerMap.remove(aThreadId);
}
---------------

Answer 1

There was a race condition in my producer initialization code. I fixed it by changing the producer map to a ConcurrentHashMap so that access to it is thread-safe.
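
A minimal sketch of that kind of fix, assuming the race was in the check-then-put on the shared map in getProducer. ProducerRegistry and its factory parameter are hypothetical names, not part of the question's code or of Kafka; in the real application the factory would construct a KafkaProducer from the given properties and call initTransactions() on it. The sketch also copies the base properties for each thread, because the question's getProducer writes transactional.id into a shared Properties object, which may let two threads end up with the same ID.

```java
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-thread producer registry; P stands in for Producer<String, String>. */
public class ProducerRegistry<P> {
    private final Map<String, P> producers = new ConcurrentHashMap<>();
    private final Properties baseProps;
    private final Function<Properties, P> factory;

    public ProducerRegistry(Properties baseProps, Function<Properties, P> factory) {
        this.baseProps = baseProps;
        this.factory = factory;
    }

    /** Returns the producer for this thread ID, creating it at most once. */
    public P getProducer(String threadId) {
        return producers.computeIfAbsent(threadId, id -> {
            // Copy the shared settings so each producer gets its own transactional.id.
            Properties copy = new Properties();
            copy.putAll(baseProps);
            copy.setProperty("transactional.id", id);
            return factory.apply(copy);
        });
    }

    /** Removes and returns the producer for this thread ID, or null if there is none. */
    public P removeProducer(String threadId) {
        return producers.remove(threadId);
    }
}
```

ConcurrentHashMap.computeIfAbsent applies the factory at most once per key, so two concurrent calls with the same thread ID cannot each create a producer with the same transactional.id.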

Answer 2

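Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId, or the producer's transaction has been expired by the broker.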

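This exception message is not very helpful. I believe it is trying to say that the broker no longer has any record of the transaction ID that the client is sending. This can be because either: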

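Someone else was using the same transaction ID and has already committed it. In my experience this is less likely unless you share transaction IDs between clients. We make sure our IDs are unique by using UUID.randomUUID().
The transaction timed out and was automatically removed by the broker.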

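In our case we hit transaction timeouts every so often, and those produced this exception. Two properties govern how long the broker remembers a transaction before aborting and forgetting it.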

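transaction.max.timeout.ms - a broker property that specifies the maximum number of milliseconds before a transaction is aborted and forgotten. The default in many Kafka versions seems to be 900000 (15 minutes). The Kafka documentation says:

The maximum allowed timeout for transactions. If a client's requested transaction time exceeds this, then the broker will return an error in InitProducerIdRequest. This prevents a client from too large of a timeout, which can stall consumers reading from topics included in the transaction.

transaction.timeout.ms - a producer client property that sets the timeout in milliseconds when a transaction is created. The default in many Kafka versions seems to be 60000 (1 minute). The Kafka documentation says:

The maximum amount of time in ms that the transaction coordinator will wait for a transaction status update from the producer before proactively aborting the ongoing transaction.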

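If the transaction.timeout.ms property set in the client exceeds the transaction.max.timeout.ms property in the broker, the producer immediately throws something like the following exception: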

org.apache.kafka.common.KafkaException: Unexpected error in InitProducerIdResponse
The transaction timeout is larger than the maximum value allowed by the broker 
(as configured by transaction.max.timeout.ms).
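
As a rough sketch of how the client-side setting might be applied, the helper below copies a base producer configuration and sets transaction.timeout.ms, rejecting values above a broker maximum that the caller passes in. The class name, the method name, and the brokerMaxMs parameter are assumptions for illustration; the client cannot look up the broker's transaction.max.timeout.ms this way, so that number has to come from your own broker configuration.

```java
import java.util.Properties;

/** Hypothetical helper for setting the producer's transaction timeout. */
public class TransactionTimeoutConfig {

    /**
     * Returns a copy of base with transaction.timeout.ms set to timeoutMs.
     * brokerMaxMs is the broker's transaction.max.timeout.ms as known to the caller.
     */
    public static Properties withTransactionTimeout(Properties base, int timeoutMs, int brokerMaxMs) {
        if (timeoutMs > brokerMaxMs) {
            // Fail early instead of waiting for the broker to reject the producer.
            throw new IllegalArgumentException(
                "transaction.timeout.ms (" + timeoutMs
                    + ") exceeds transaction.max.timeout.ms (" + brokerMaxMs + ")");
        }
        Properties copy = new Properties();
        copy.putAll(base);
        copy.setProperty("transaction.timeout.ms", Integer.toString(timeoutMs));
        return copy;
    }
}
```

For example, withTransactionTimeout(props, 300000, 900000) gives the producer five minutes per transaction while staying under the 15-minute broker default mentioned above.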

Answer 3

I wrote a unit test to reproduce this. From this Java code you can easily see how it happens when two producers share the same transactional ID.

  @Test
  public void SendOffset_TwoProducerDuplicateTrxId_ThrowException() {
    // create two producers with the same transactional id
    Producer producer1 = KafkaBuilder.buildProducer(trxId, servers);
    Producer producer2 = KafkaBuilder.buildProducer(trxId, servers);

    offsetMap.put(new TopicPartition(topic, 0), new OffsetAndMetadata(1000));

    // initialize both producers and start a transaction on each
    sendOffsetBegin(producer1);
    sendOffsetBegin(producer2);

    try {
      // committing the first transaction is expected to throw
      sendOffsetEnd(producer1);

      // execution should never reach this line
      Assert.fail("expected ProducerFencedException");
    } catch (ProducerFencedException e) {
      // expected: producer2 took over the transactional id and fenced producer1
    }
  }

  private void sendOffsetBegin(Producer producer) {
    producer.initTransactions();
    producer.beginTransaction();
    producer.sendOffsetsToTransaction(offsetMap, consumerGroup);
  }

  private void sendOffsetEnd(Producer producer) {
    producer.commitTransaction();
  }
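
The test shows producer1 being fenced when it commits. In the question's code, the catch block then calls abortTransaction() on the producer that just failed, and the stack trace shows that call raising the KafkaException. The KafkaProducer Javadoc treats ProducerFencedException (along with OutOfOrderSequenceException and AuthorizationException) as fatal: the producer can only be closed, while other KafkaExceptions can be handled by aborting the transaction. The sketch below shows that control flow with hypothetical stand-ins (TxProducer, FatalProducerException) in place of the real Kafka classes so that it compiles on its own; in real code you would catch the Kafka exception types directly.

```java
public class TransactionErrorHandling {

    /** Hypothetical stand-in for the transactional subset of KafkaProducer. */
    interface TxProducer {
        void beginTransaction();
        void commitTransaction();
        void abortTransaction();
        void close();
    }

    /** Hypothetical stand-in for fatal errors such as ProducerFencedException. */
    static class FatalProducerException extends RuntimeException {
        FatalProducerException(String message) {
            super(message);
        }
    }

    /**
     * Runs work inside a transaction.
     * Returns true if the producer can be reused, false if it was closed.
     */
    static boolean runInTransaction(TxProducer producer, Runnable work) {
        try {
            producer.beginTransaction();
            work.run();
            producer.commitTransaction();
            return true;
        } catch (FatalProducerException e) {
            // Fatal (e.g. fenced): do not abort, the producer can only be closed.
            producer.close();
            return false;
        } catch (RuntimeException e) {
            // Any other failure: abort the transaction and keep the producer.
            producer.abortTransaction();
            return true;
        }
    }
}
```

When runInTransaction returns false, the caller would drop the closed producer from its map and create a new one before retrying.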

Answer 4

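When running multiple instances of the application, the transactional.id must be the same on all instances to satisfy fencing zombies when producing records on a listener container thread. However, when producing records using transactions that are not started by a listener container, the prefix has to be different on each instance.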

https://docs.spring.io/spring-kafka/reference/html/#transaction-id-prefix

Kafka - What is the reason for getting ProducerFencedException during producer.send
