Why am I not seeing Reactive Kafka's high performance? (0.11 release)

Date: 2016-09-21 13:22:28

Tags: akka

Why can't I see the high TPS (transactions per second) that the project authors report for Reactive Kafka?

This code is derived from the benchmark code in the reactive-kafka project and runs against a single-partition topic pre-populated with 2M records. When I run it I get roughly 140K TPS. Not terrible, but a long way from the hoped-for hundreds of thousands.

What concerns me most is that this is a topic with only 1 partition, which isn't really a realistic test case.
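(com.foo.Producer is just a helper that pre-populates the topic and isn't shown; for illustration, a stand-in for it, not the actual class, might look like this with the plain KafkaProducer:)

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Hypothetical stand-in for com.foo.Producer: writes msgCount small string
// records to the topic and flushes so everything is on the broker before the test.
class Producer(kafkaHost: String) {
  private val props = new Properties()
  props.put("bootstrap.servers", kafkaHost)
  props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  private val producer = new KafkaProducer[Array[Byte], String](props)

  def populate(msgCount: Int, topic: String): Unit = {
    var i = 0
    while (i < msgCount) {
      producer.send(new ProducerRecord[Array[Byte], String](topic, s"msg-$i"))
      i += 1
    }
    producer.flush() // block until all buffered records have been sent
  }
}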

// Imports assumed for akka-stream-kafka 0.11 and the 0.10.x Kafka clients:
import akka.actor.ActorSystem
import akka.dispatch.ExecutionContexts
import akka.kafka.ConsumerMessage.CommittableOffsetBatch
import akka.kafka.scaladsl.Consumer
import akka.kafka.{ConsumerSettings, Subscriptions}
import akka.stream.Materializer
import akka.stream.scaladsl.{Keep, Sink}
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, StringDeserializer}
import scala.concurrent.duration._
import scala.concurrent.{Await, Promise}
import scala.util.Success

case class RunTest4(msgCount: Int, producer: com.foo.Producer, kafkaHost: String, groupId: String, topic: String)(implicit system: ActorSystem) {

  // Pre-populate a topic w/some records (2 million)
  producer.populate(msgCount, topic)
  Thread.sleep(2000)
  partitionInfo(topic)
  val partitionTarget = msgCount - 1

  val settings = ConsumerSettings(system, new ByteArrayDeserializer, new StringDeserializer)
    .withBootstrapServers(kafkaHost)
    .withGroupId(groupId)
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

  def consumerAtLeastOnceBatched(batchSize: Int)(implicit mat: Materializer): Unit = {
    val promise = Promise[Unit]
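    // Stream committable messages from the topic, keep just the offsets, fold
    // them into batches of up to batchSize, commit each batch with at most 3
    // commits in flight, and complete the promise once the last offset is seen.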
    val control = Consumer.committableSource(settings, Subscriptions.topics(topic))
      .map {
        msg => msg.committableOffset
      }
      .batch(batchSize.toLong, first => CommittableOffsetBatch.empty.updated(first)) { (batch, elem) =>
        batch.updated(elem)
      }
      .mapAsync(3) { m =>
        m.commitScaladsl().map(_ => m)(ExecutionContexts.sameThreadExecutionContext)
      }
      .toMat(Sink.foreach { batch =>
        if (batch.offsets().head._2 >= partitionTarget)
          promise.complete(Success(()))
      })(Keep.left)
      .run()

    println("Control is: " + control.getClass.getName)
    val now = System.currentTimeMillis()
    Await.result(promise.future, 30.seconds)
    val later = System.currentTimeMillis()
    println("TPS: " + (msgCount / ((later - now) / 1000.0)))
    control.shutdown()

    groupInfo(groupId)
  }

  private def partitionInfo(topic: String) =
    kafka.tools.GetOffsetShell.main(Array("--topic", topic, "--broker-list", kafkaHost, "--time", "-1"))
  private def groupInfo(group: String) =
    kafka.admin.ConsumerGroupCommand.main(Array("--describe", "--group", group, "--bootstrap-server", kafkaHost, "--new-consumer"))

}
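For completeness, the harness is driven roughly like this (the host, group, topic and batch size below are placeholders, not the exact values from my runs):

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer

implicit val system = ActorSystem("kafka-bench")
implicit val mat = ActorMaterializer()

// Placeholder wiring; construction of the producer helper is not shown.
val producer: com.foo.Producer = ???
val test = RunTest4(2000000, producer, "localhost:9092", "bench-group", "bench-topic-1p")

// populate() runs in the case-class constructor; this call does the timed consume.
test.consumerAtLeastOnceBatched(batchSize = 10000)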

This next test was (I hoped) a good way to handle multiple partitions per topic - a more realistic scenario. When I run it with a batch size of 10,000 against a topic pre-populated with 2M records spread across 4 partitions, the test times out at 30 seconds, which means that however long it would actually take, the TPS would be < 67K (2M / 30)... really not good. (The test does succeed with a smaller record population, but that's not the test!)

(For reference, my LateKafka project (which produces a Source), admittedly very bare-bones, gets over 300K TPS on the same test, and a raw native KafkaConsumer reaches roughly 500K on my laptop; a sketch of that kind of poll loop follows the second test below.)

case class RunTest3(msgCount: Int, producer: com.foo.Producer, kafkaHost: String, groupId: String, topic: String)(implicit system: ActorSystem) {

  // Pre-populate a topic w/some records (2 million)
  producer.populate(msgCount, topic)
  Thread.sleep(2000)
  partitionInfo(topic)
  val partitionTarget = msgCount - 1

  val settings = ConsumerSettings(system, new ByteArrayDeserializer, new StringDeserializer)
    .withBootstrapServers(kafkaHost)
    .withGroupId(groupId)
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

  def consumerAtLeastOnceBatched(batchSize: Int)(implicit mat: Materializer): Unit = {
    val promise = Promise[Unit]
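    // Same pipeline as RunTest4 above, except the source is partition-aware:
    // committablePartitionedSource emits one sub-source per partition, and
    // flatMapMerge(4, _._2) merges the 4 per-partition streams into one before
    // the offsets are batched and committed.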
    val control = Consumer.committablePartitionedSource(settings, Subscriptions.topics(topic))
      .flatMapMerge(4, _._2)
      .map {
        msg => msg.committableOffset
      }
      .batch(batchSize.toLong, first => CommittableOffsetBatch.empty.updated(first)) { (batch, elem) =>
        batch.updated(elem)
      }
      .mapAsync(3) { m =>
        m.commitScaladsl().map(_ => m)(ExecutionContexts.sameThreadExecutionContext)
      }
      .toMat(Sink.foreach { batch =>
        if (batch.offsets().head._2 >= partitionTarget)
          promise.complete(Success(()))
      })(Keep.left)
      .run()

    println("Control is: " + control.getClass.getName)
    val now = System.currentTimeMillis()
    Await.result(promise.future, 30.seconds)
    val later = System.currentTimeMillis()
    println("TPS: " + (msgCount / ((later - now) / 1000.0)))
    control.shutdown()

    groupInfo(groupId)
  }

  private def partitionInfo(topic: String) =
    kafka.tools.GetOffsetShell.main(Array("--topic", topic, "--broker-list", kafkaHost, "--time", "-1"))
  private def groupInfo(group: String) =
    kafka.admin.ConsumerGroupCommand.main(Array("--describe", "--group", group, "--bootstrap-server", kafkaHost, "--new-consumer"))

}
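As a point of reference for the ~500K figure mentioned above, the kind of bare KafkaConsumer poll loop I compared against looks roughly like this (a sketch, not the exact code I ran; the property values and names are illustrative):

import java.util.{Arrays, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer

// Hypothetical baseline: drain msgCount records from the topic with the plain
// consumer and report records/second, with no committing or streaming layer.
def rawConsumerBaseline(kafkaHost: String, topic: String, msgCount: Int): Unit = {
  val props = new Properties()
  props.put("bootstrap.servers", kafkaHost)
  props.put("group.id", "baseline-group")           // assumed group id
  props.put("auto.offset.reset", "earliest")
  props.put("enable.auto.commit", "false")
  props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

  val consumer = new KafkaConsumer[Array[Byte], String](props)
  consumer.subscribe(Arrays.asList(topic))

  var read = 0
  val start = System.currentTimeMillis()             // includes initial partition assignment
  while (read < msgCount) {
    read += consumer.poll(1000).count()              // count records returned by each poll
  }
  val elapsed = (System.currentTimeMillis() - start) / 1000.0
  println("Raw consumer TPS: " + (msgCount / elapsed))
  consumer.close()
}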

Are these results to be expected, or is there a problem with my test code?

0 Answers:

There are no answers yet.