August 2019 - Kafka consumer lag programmatically

Time: 2019-08-01 05:19:36

Tags: apache-kafka spring-kafka kafka-python

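Is there any way to find the lag of a Kafka consumer programmatically? I don't want to install an external Kafka Manager tool and check a dashboard.

We can list all the consumer groups and check the lag for each group.

Currently we do have a command to check the lag, but it requires the relative path where Kafka resides.

Spring-Kafka, kafka-python, the Kafka Admin client, or JMX: is there any way to write code that finds the lag?

We were careless and did not monitor the process. The consumer went into a zombie state and the lag reached 50,000, which caused a lot of chaos.

We only think of these cases once a problem arises. We were monitoring the script, but did not know it would end up as a zombie process.

Any thoughts are welcome!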

5 answers:

Answer 0 (score: 2)

You can get this with kafka-python. Make sure you run it on the active controller; it reports the consumer lag for every topic partition.

from kafka import KafkaConsumer, TopicPartition
from kafka.admin.client import KafkaAdminClient
from kafka.protocol.group import MemberAssignment

# Assumption: placeholder value; set this to your own broker list.
BOOTSTRAP_SERVERS = "localhost:9092"

client = KafkaAdminClient(bootstrap_servers=BOOTSTRAP_SERVERS, request_timeout_ms=300)

# list_consumer_groups() yields (group_id, protocol_type) tuples.
for group_id, protocol_type in client.list_consumer_groups():
    if protocol_type != 'consumer':
        continue
    # Tuple layout of the kafka-python release this answer was written for;
    # other releases may return a different structure.
    group_description = client.describe_consumer_groups([group_id])[0]
    (error_code, _, state, _, protocol, members) = group_description

    for member in members:
        (member_id, client_id, client_host, member_metadata, member_assignment) = member
        # Topics currently assigned to this member.
        member_topics_assignment = [
            topic for (topic, partitions) in MemberAssignment.decode(member_assignment).assignment
        ]

        for topic in member_topics_assignment:
            consumer = KafkaConsumer(
                bootstrap_servers=BOOTSTRAP_SERVERS,
                group_id=group_id,
                enable_auto_commit=False
            )
            consumer.topics()  # force a metadata fetch

            for p in consumer.partitions_for_topic(topic):
                tp = TopicPartition(topic, p)
                consumer.assign([tp])
                committed = consumer.committed(tp)
                consumer.seek_to_end(tp)
                last_offset = consumer.position(tp)
                if last_offset is not None and committed is not None:
                    lag = last_offset - committed
                    print("group: {} topic:{} partition: {} lag: {}".format(group_id, topic, p, lag))
            consumer.close(autocommit=False)

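Answer 1 (score: 1)

Yes. We can get consumer lag with kafka-python. I'm not sure this is the best way to do it, but it works.

Currently we supply our consumers manually. You can also get the consumers from kafka-python, but it lists only active consumers, so if one of your consumers is down it may not show up in the list.

First, establish the client connection: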
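If a stopped consumer can drop out of the listing as described, one option is to keep your own list of expected groups and compare it with what the cluster reports. A minimal sketch, assuming the listing arrives as `(group_id, protocol_type)` tuples as in the other answers here; `EXPECTED_GROUPS`, `find_missing_groups`, and the group names are hypothetical:

```python
# Hypothetical list of consumer groups that should always be present.
EXPECTED_GROUPS = ["billing", "audit", "search-indexer"]

def find_missing_groups(expected, listed):
    """Return the expected group ids that are absent from the cluster listing.

    `listed` is an iterable of (group_id, protocol_type) tuples, the shape
    assumed here for KafkaAdminClient.list_consumer_groups().
    """
    active = {group_id for group_id, protocol_type in listed
              if protocol_type == "consumer"}
    return sorted(set(expected) - active)

# Example listing, as it might come back from the admin client.
listed = [("billing", "consumer"),
          ("search-indexer", "consumer"),
          ("connect-x", "connect")]
missing = find_missing_groups(EXPECTED_GROUPS, listed)
print(missing)
```

Any group returned by `find_missing_groups` is a candidate for an alert, regardless of its lag.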

from kafka import BrokerConnection
from kafka.protocol.commit import OffsetFetchRequest_v3
import socket

# This takes only one broker at a time. To use multiple brokers, loop over them, passing each broker's IP and port.

def establish_broker_connection(server, port, group):
    '''
    Client connection to one broker for getting consumer offset info
    '''
    bc = BrokerConnection(server, port, socket.AF_INET)
    bc.connect_blocking()
    # Passing None for the topics requests offsets for all topics of the group.
    fetch_offset_request = OffsetFetchRequest_v3(group, None)
    future = bc.send(fetch_offset_request)
    # Return both so the caller can wait on the future using this connection.
    return future, bc

Next we need the current offset for each topic the consumer is subscribed to. Pass in the future and bc from above.

from kafka import SimpleClient
from kafka.common import OffsetRequestPayload

def _get_client_connection():
    '''
    Client connection to the cluster for getting topic info
    '''
    # BOOTSTRAP_SERVERS is a comma-separated broker list: "broker1:port1,broker2:port2"
    client = SimpleClient(BOOTSTRAP_SERVERS)
    return client

def get_latest_offsets_for_topic(client, topic):
    '''
    Get the latest offset of every partition of a topic
    '''
    partitions = client.topic_partitions[topic]
    # time=-1 asks for the latest offset; 1 is the maximum number of offsets to return
    offset_requests = [OffsetRequestPayload(topic, p, -1, 1) for p in partitions.keys()]
    offsets_responses = client.send_offset_request(offset_requests)
    # One response per partition, so key the result by partition id
    return {resp.partition: resp.offsets[0] for resp in offsets_responses}

def get_current_offset_for_consumer_group(future, bc):
    '''
    Get current offset info for a consumer group and compare it with the latest offsets
    '''
    while not future.is_done:
        for resp, f in bc.recv():
            f.success(resp)

    client = _get_client_connection()
    lags = {}
    # future.value.topics is a list of (topic, partitions) entries.
    for topic in future.value.topics:
        latest_offsets = get_latest_offsets_for_topic(client, topic[0])
        for partition in topic[1]:
            # partition[0] is the partition id, partition[1] the committed offset
            offset_difference = latest_offsets[partition[0]] - partition[1]
            lags[(topic[0], partition[0])] = offset_difference
    return lags

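offset_difference is the difference between the last offset produced in the topic and the last offset (or message) consumed by your consumer.

If you do not get a current offset for a consumer on a topic, your consumer is probably down.

So you can raise an alert or send mail if the offset difference exceeds your desired threshold, or if your consumer returns empty offsets.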
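The alerting rule described above can be sketched as a small pure function. This is only an illustration under assumptions: `lag_alerts`, the dictionary shapes, and the sample numbers are hypothetical, and a committed offset of `None` or `-1` is treated as "no offset committed".

```python
def lag_alerts(latest_offsets, committed_offsets, threshold):
    """Return alert strings for partitions that lag too far or have no committed offset.

    latest_offsets:    {(topic, partition): last produced offset}
    committed_offsets: {(topic, partition): committed offset, or None/-1 if absent}
    """
    alerts = []
    for tp, latest in sorted(latest_offsets.items()):
        committed = committed_offsets.get(tp)
        if committed is None or committed < 0:
            # No committed offset: the consumer may be down or may never have run.
            alerts.append("{}-{}: no committed offset".format(tp[0], tp[1]))
            continue
        lag = latest - committed
        if lag > threshold:
            alerts.append("{}-{}: lag {} exceeds {}".format(tp[0], tp[1], lag, threshold))
    return alerts

# Made-up sample data for three partitions of one topic.
latest = {("events", 0): 1200, ("events", 1): 980, ("events", 2): 51000}
committed = {("events", 0): 1195, ("events", 1): -1, ("events", 2): 1000}
for line in lag_alerts(latest, committed, threshold=1000):
    print(line)
```

The returned strings could then be handed to whatever mail or paging hook you already use.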

Answer 2 (score: 1)

The Java client exposes the lag for its consumers over JMX; in this example we have 5 partitions...

Spring Boot can publish this to Micrometer.

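Answer 3 (score: 0)

I wrote the code in Scala, but it uses only the native Java API from KafkaConsumer and KafkaProducer.

You only need to know the name of the consumer group and the topics. It is possible to avoid predefined topics, but then you get lag only for consumer groups that exist and are in the stable state, which can be a problem for alerting. So all you really need to know and use is:

  1. KafkaConsumer.committed - returns the latest committed offset for a TopicPartition
  2. KafkaConsumer.assign - do not use subscribe, because it causes the consumer group to rebalance. You definitely do not want your monitoring process to affect the topic it monitors.
  3. KafkaConsumer.endOffsets - returns the latest produced offset
  4. Consumer group lag - the difference between the latest produced offset and the latest committed offset
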
import java.util.{Properties, UUID}

import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.{StringDeserializer, StringSerializer}

import scala.collection.JavaConverters._
import scala.util.Try

case class TopicPartitionInfo(topic: String, partition: Long, currentPosition: Long, endOffset: Long) {
  val lag: Long = endOffset - currentPosition

  override def toString: String = s"topic=$topic,partition=$partition,currentPosition=$currentPosition,endOffset=$endOffset,lag=$lag"
}

case class ConsumerGroupInfo(consumerGroup: String, topicPartitionInfo: List[TopicPartitionInfo]) {
  override def toString: String = s"ConsumerGroup=$consumerGroup:\n${topicPartitionInfo.mkString("\n")}"
}

object ConsumerLag {

  def consumerGroupInfo(bootStrapServers: String, consumerGroup: String, topics: List[String]) = {
    val properties = new Properties()
    properties.put("bootstrap.servers", bootStrapServers)
    properties.put("auto.offset.reset", "latest")
    properties.put("group.id", consumerGroup)
    properties.put("key.deserializer", classOf[StringDeserializer])
    properties.put("value.deserializer", classOf[StringDeserializer])
    properties.put("key.serializer", classOf[StringSerializer])
    properties.put("value.serializer", classOf[StringSerializer])
    properties.put("client.id", UUID.randomUUID().toString)

    val kafkaProducer = new KafkaProducer[String, String](properties)
    val kafkaConsumer = new KafkaConsumer[String, String](properties)
    try {
      val assignment = topics
        .map(topic => kafkaProducer.partitionsFor(topic).asScala)
        .flatMap(partitions => partitions.map(p => new TopicPartition(p.topic, p.partition)))
        .asJava
      // assign (not subscribe) so that the monitor never triggers a rebalance
      kafkaConsumer.assign(assignment)

      ConsumerGroupInfo(consumerGroup,
        kafkaConsumer.endOffsets(assignment).asScala
          .map { case (tp, latestOffset) =>
            TopicPartitionInfo(tp.topic,
              tp.partition,
              // committed returns null when the group has no committed offset for the partition
              Option(kafkaConsumer.committed(tp)).map(_.offset).getOrElse(0L), // TODO warn on null: the consumer group may not exist
              latestOffset)
          }
          .toList
      )
    } finally {
      // Release the network resources held by both clients
      kafkaProducer.close()
      kafkaConsumer.close()
    }
  }

  def main(args: Array[String]): Unit = {
    println(
      consumerGroupInfo(
        bootStrapServers = "kafka-prod:9092",
        consumerGroup = "not-exist",
        topics = List("events", "anotherevents")
      )
    )

    println(
      consumerGroupInfo(
        bootStrapServers = "kafka:9092",
        consumerGroup = "consumerGroup1",
        topics = List("events", "anotherevents")
      )
    )
  }
}

Answer 4 (score: 0)

If anyone is looking for consumer lag in Confluent Cloud, here is a simple script:

from kafka import KafkaConsumer, TopicPartition
from kafka.admin import KafkaAdminClient

BOOTSTRAP_SERVERS = "<>.aws.confluent.cloud"
CCLOUD_API_KEY = "{{ ccloud_apikey }}"
CCLOUD_API_SECRET = "{{ ccloud_apisecret }}"
ENVIRONMENT = "dev"
CLUSTERID = "dev"
CACERT = "/usr/local/lib/python{{ python3_version }}/site-packages/certifi/cacert.pem"

def main():

  client = KafkaAdminClient(bootstrap_servers=BOOTSTRAP_SERVERS,
                            ssl_cafile=CACERT,
                            security_protocol='SASL_SSL',
                            sasl_mechanism='PLAIN',
                            sasl_plain_username=CCLOUD_API_KEY,
                            sasl_plain_password=CCLOUD_API_SECRET)

  for group in client.list_consumer_groups():
    if group[1] == 'consumer':
      list_members_in_groups =  client.list_consumer_group_offsets(group[0])
      for (topic,partition) in list_members_in_groups:

        consumer = KafkaConsumer(
                                   bootstrap_servers=BOOTSTRAP_SERVERS,
                                   ssl_cafile=CACERT,
                                   group_id=group[0],
                                   enable_auto_commit=False,
                                   api_version=(0,10),
                                   security_protocol='SASL_SSL',
                                   sasl_mechanism='PLAIN',
                                   sasl_plain_username=CCLOUD_API_KEY,
                                   sasl_plain_password=CCLOUD_API_SECRET
                                 )
        consumer.topics()

        tp = TopicPartition(topic, partition)
        consumer.assign([tp])
        committed = consumer.committed(tp)
        consumer.seek_to_end(tp)
        last_offset = consumer.position(tp)
        if last_offset is not None and committed is not None:
          lag = last_offset - committed
          print("group: {} topic:{} partition: {} lag: {}".format(group[0], topic, partition, lag))
        consumer.close(autocommit=False)

if __name__ == "__main__":
  main()