Spring Cloud AWS Kinesis Binder load balancing

Date: 2019-04-01 15:37:51

Tags: spring-cloud-stream spring-cloud-aws spring-integration-aws

I am trying to implement load balancing for AWS Kinesis Stream consumers.

Following the documentation, I am trying this configuration:

spring:
  cloud:
    stream:
      instanceIndex: 1
      instanceCount: 3
      bindings:
        RollUpInboundStream:
          group: my-consumer-group
          destination: my-kinesis-stream
          content-type: application/json

I have 3 containers, and I want to start new containers when needed (up to 6) without restarting the existing ones.

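  1. Does instanceIndex start from 0 or 1?
  2. If I set instanceCount to 6 but bring up only three instances, will all messages still be consumed until I bring up the new instances?
  3. The documentation mentions a property called spring.cloud.stream.bindings.<channelName>.consumer.concurrency. Could you explain its significance?
  4. If any instance goes down for some reason, will the messages it was responsible for remain unconsumed?

Could you help us?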

1 answer:

Answer 0: (score: 1)

spring.cloud.stream.bindings.<channelName>.consumer.concurrency is an internal option of each individual consumer:

adapter.setConcurrency(properties.getConcurrency());

...

/**
 * The maximum number of concurrent {@link ConsumerInvoker}s running.
 * The {@link ShardConsumer}s are evenly distributed between {@link ConsumerInvoker}s.
 * Messages from within the same shard will be processed sequentially.
 * In other words each shard is tied with the particular thread.
 * By default the concurrency is unlimited and shard
 * is processed in the {@link #consumerExecutor} directly.
 * @param concurrency the concurrency maximum number
 */
public void setConcurrency(int concurrency) {

So this has nothing to do with your distributed solution.

instanceIndex and instanceCount work in the binder this way:

    if (properties.getInstanceCount() > 1) {
        shardOffsets = new HashSet<>();
        KinesisConsumerDestination kinesisConsumerDestination = (KinesisConsumerDestination) destination;
        List<Shard> shards = kinesisConsumerDestination.getShards();
        for (int i = 0; i < shards.size(); i++) {
            // divide shards across instances
            if ((i % properties.getInstanceCount()) == properties.getInstanceIndex()) {
                KinesisShardOffset shardOffset = new KinesisShardOffset(
                        kinesisShardOffset);
                shardOffset.setStream(destination.getName());
                shardOffset.setShard(shards.get(i).getShardId());
                shardOffsets.add(shardOffset);
            }
        }
    }

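So each consumer gets a subset of the shards in the stream. As a result, some shards may end up not being consumed: the shard at position i is read only by the instance whose instanceIndex equals i % instanceCount, so if that instance is not running (for example, instanceCount is 6 but only three instances are up), its shards stay unread. The same check also means that valid instanceIndex values run from 0 to instanceCount - 1.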
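The modulo rule can be tried in isolation to see which shards a given deployment would leave unread. This is only a minimal sketch, not the binder's code: the class name ShardAssignmentSketch, the helper shardsFor and the shard IDs are hypothetical, and plain strings stand in for the Shard objects that the binder gets from the destination.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardAssignmentSketch {

    // Hypothetical helper that mirrors the binder's check:
    // the shard at position i belongs to the instance where
    // (i % instanceCount) == instanceIndex.
    static List<String> shardsFor(List<String> shardIds, int instanceCount, int instanceIndex) {
        List<String> assigned = new ArrayList<>();
        for (int i = 0; i < shardIds.size(); i++) {
            if ((i % instanceCount) == instanceIndex) {
                assigned.add(shardIds.get(i));
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Made-up shard IDs for a stream with six shards.
        List<String> shards = List.of(
                "shard-0", "shard-1", "shard-2", "shard-3", "shard-4", "shard-5");

        int instanceCount = 6;    // value configured on every instance
        int runningInstances = 3; // only indexes 0, 1 and 2 are actually started

        List<String> consumed = new ArrayList<>();
        for (int index = 0; index < runningInstances; index++) {
            List<String> assigned = shardsFor(shards, instanceCount, index);
            consumed.addAll(assigned);
            System.out.println("instanceIndex " + index + " reads " + assigned);
        }

        // Whatever is left over has no running consumer.
        List<String> unread = new ArrayList<>(shards);
        unread.removeAll(consumed);
        System.out.println("shards with no running consumer: " + unread);
    }
}
```

Under these assumptions, each running instance picks up exactly one shard, and the three shards mapped to indexes 3, 4 and 5 have no reader until those instances are started. Calling shardsFor with an instanceIndex equal to instanceCount returns an empty list, which is why an index that starts from 1 would leave the last instance idle and the index-0 shards unread.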

There is no concurrent consumption of messages from the same shard: only one thread per cluster can consume a given shard.