Why can't I see the high TPS (transactions/sec) performance from Reactive Kafka that the project's authors advertise?
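This code is derived from the benchmark code in the Reactive Kafka project. It consumes 2M records pre-populated in a single-partition topic. When I run it I get about 140K TPS. Not terrible, but far short of the hundreds of thousands I was hoping for.
My biggest concern is that this topic has only 1 partition, which is not a realistic test case.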
case class RunTest4(msgCount: Int, producer: com.foo.Producer, kafkaHost: String, groupId: String, topic: String)(implicit system: ActorSystem) {

  // Pre-populate a topic w/some records (2 million)
  producer.populate(msgCount, topic)
  Thread.sleep(2000)
  partitionInfo(topic)
  val partitionTarget = msgCount - 1

  val settings = ConsumerSettings(system, new ByteArrayDeserializer, new StringDeserializer)
    .withBootstrapServers(kafkaHost)
    .withGroupId(groupId)
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

  def consumerAtLeastOnceBatched(batchSize: Int)(implicit mat: Materializer): Unit = {
    val promise = Promise[Unit]
    val control = Consumer.committableSource(settings, Subscriptions.topics(topic))
      .map {
        msg => msg.committableOffset
      }
      .batch(batchSize.toLong, first => CommittableOffsetBatch.empty.updated(first)) { (batch, elem) =>
        batch.updated(elem)
      }
      .mapAsync(3) { m =>
        m.commitScaladsl().map(_ => m)(ExecutionContexts.sameThreadExecutionContext)
      }
      .toMat(Sink.foreach { batch =>
        if (batch.offsets().head._2 >= partitionTarget)
          promise.complete(Success(()))
      })(Keep.left)
      .run()

    println("Control is: " + control.getClass.getName)
    val now = System.currentTimeMillis()
    Await.result(promise.future, 30.seconds)
    val later = System.currentTimeMillis()
    println("TPS: " + (msgCount / ((later - now) / 1000.0)))
    control.shutdown()
    groupInfo(groupId)
  }

  private def partitionInfo(topic: String) =
    kafka.tools.GetOffsetShell.main(Array("--topic", topic, "--broker-list", kafkaHost, "--time", "-1"))

  private def groupInfo(group: String) =
    kafka.admin.ConsumerGroupCommand.main(Array("--describe", "--group", group, "--bootstrap-server", kafkaHost, "--new-consumer"))
}
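The next test is (I hope) a good way to handle multiple partitions per topic, which is a more realistic situation. When I run it with a batch size of 10,000 against a topic populated with 2M records spread across 4 partitions, the test times out after waiting 30 seconds. That means that whenever it would eventually have finished, the TPS would be below 67K (2M / 30), which is really not good. (The test does succeed with a smaller record population, but that's not the test!)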
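To make the timeout arithmetic concrete, here is a minimal sketch of the same calculation the test prints. The object name `TpsBound` and the helper `tps` are made up for illustration and are not part of the test code. It only restates the 2M / 30 bound and says nothing about why the run is slow.

```scala
object TpsBound {
  // Throughput in records per second for `records` consumed in `elapsedMillis`.
  // Mirrors the expression msgCount / ((later - now) / 1000.0) used in the test.
  def tps(records: Long, elapsedMillis: Long): Double =
    records / (elapsedMillis / 1000.0)

  def main(args: Array[String]): Unit = {
    val msgCount = 2000000L     // records pre-populated in the topic
    val timeoutMillis = 30000L  // the 30.seconds passed to Await.result

    // A run that has not completed within the timeout must be slower than this.
    val upperBound = tps(msgCount, timeoutMillis)
    println(f"TPS upper bound when the test times out: $upperBound%.0f")
  }
}
```

Since the run did not finish inside the 30-second window, its real throughput is somewhere below this figure. The timeout only gives an upper bound, not a measurement.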
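(For reference, my LateKafka project, which produces a Source and is really bare-bones, achieves over 300K TPS in the same test, and I get about 500K using the native KafkaConsumer on my laptop.)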
case class RunTest3(msgCount: Int, producer: com.foo.Producer, kafkaHost: String, groupId: String, topic: String)(implicit system: ActorSystem) {

  // Pre-populate a topic w/some records (2 million)
  producer.populate(msgCount, topic)
  Thread.sleep(2000)
  partitionInfo(topic)
  val partitionTarget = msgCount - 1

  val settings = ConsumerSettings(system, new ByteArrayDeserializer, new StringDeserializer)
    .withBootstrapServers(kafkaHost)
    .withGroupId(groupId)
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

  def consumerAtLeastOnceBatched(batchSize: Int)(implicit mat: Materializer): Unit = {
    val promise = Promise[Unit]
    val control = Consumer.committablePartitionedSource(settings, Subscriptions.topics(topic))
      .flatMapMerge(4, _._2)
      .map {
        msg => msg.committableOffset
      }
      .batch(batchSize.toLong, first => CommittableOffsetBatch.empty.updated(first)) { (batch, elem) =>
        batch.updated(elem)
      }
      .mapAsync(3) { m =>
        m.commitScaladsl().map(_ => m)(ExecutionContexts.sameThreadExecutionContext)
      }
      .toMat(Sink.foreach { batch =>
        if (batch.offsets().head._2 >= partitionTarget)
          promise.complete(Success(()))
      })(Keep.left)
      .run()

    println("Control is: " + control.getClass.getName)
    val now = System.currentTimeMillis()
    Await.result(promise.future, 30.seconds)
    val later = System.currentTimeMillis()
    println("TPS: " + (msgCount / ((later - now) / 1000.0)))
    control.shutdown()
    groupInfo(groupId)
  }

  private def partitionInfo(topic: String) =
    kafka.tools.GetOffsetShell.main(Array("--topic", topic, "--broker-list", kafkaHost, "--time", "-1"))

  private def groupInfo(group: String) =
    kafka.admin.ConsumerGroupCommand.main(Array("--describe", "--group", group, "--bootstrap-server", kafkaHost, "--new-consumer"))
}
Are these the expected results, or is there something wrong with my test code?