How to use requestShutdown and shutdown to do a graceful shutdown with the KCL Java library for AWS Kinesis

Time: 2017-02-22 01:48:10

Tags: java aws-java-sdk amazon-kinesis-firehose amazon-kcl

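I am trying to use the new feature of the KCL library in Java for AWS Kinesis to do a graceful shutdown: registering a shutdown hook that stops all the record processors and then stops the worker gracefully. The new library provides a new interface that record processors need to implement. But how does it get invoked?

I tried invoking worker.requestShutdown() first and then worker.shutdown(), and it works. But is there an intended way to use them? What is the use of calling both, and what are the benefits?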

1 answer:

Answer 0 (score: 1)

Starting the consumer

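As you may know, when you create a Worker, it

1) creates the consumer offset table in DynamoDB

2) creates leases, and schedules the lease taker and renewer at the configured interval of time

If you have two partitions, there will be two records in the same DynamoDB table, meaning each partition needs a lease.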

For example:

{
  "checkpoint": "TRIM_HORIZON",
  "checkpointSubSequenceNumber": 0,
  "leaseCounter": 38,
  "leaseKey": "shardId-000000000000",
  "leaseOwner": "ComponentTest_Consumer_With_Two_Partitions_Consumer_192.168.1.83",
  "ownerSwitchesSinceCheckpoint": 0
}

{
  "checkpoint": "49570828493343584144205257440727957974505808096533676050",
  "checkpointSubSequenceNumber": 0,
  "leaseCounter": 40,
  "leaseKey": "shardId-000000000001",
  "leaseOwner": "ComponentTest_Consumer_With_Two_Partitions_Consumer_192.168.1.83",
  "ownerSwitchesSinceCheckpoint": 0
}
  • The schedule for taking and renewing leases is handled by the lease coordinator's ScheduledExecutorService (called leaseCoordinatorThreadPool).

3) Then, for each partition (shard) in the stream, the Worker creates an internal PartitionConsumer, which actually fetches the events and dispatches them to your RecordProcessor#processRecords. See ProcessTask#call.

4) As for your question, you have to register your IRecordProcessorFactory implementation with the Worker, which will use it to give each PartitionConsumer its own record processor.

For example, the following might be helpful:

KinesisClientLibConfiguration streamConfig = new KinesisClientLibConfiguration(
 "consumerName", "streamName", getAuthProfileCredentials(), "consumerName-" + "consumerInstanceId")
            .withKinesisClientConfig(getHttpConfiguration())
            .withInitialPositionInStream(InitialPositionInStream.TRIM_HORIZON); // TRIM_HORIZON = start from the oldest record still available in the stream (LATEST would be the tip)

Worker consumerWorker = new Worker.Builder()
            .recordProcessorFactory(new DavidsEventProcessorFactory())
            .config(streamConfig)
            // Worker.Builder#dynamoDBClient expects an AmazonDynamoDB client, so pass the client directly
            .dynamoDBClient(new AmazonDynamoDBClient(getAuthProfileCredentials(), getHttpConfiguration()))
            .build();


public class DavidsEventProcessorFactory implements IRecordProcessorFactory {

    private Logger logger = LogManager.getLogger(DavidsEventProcessorFactory.class);

    @Override
    public IRecordProcessor createProcessor() {
        logger.info("Creating an EventProcessor.");
        return new DavidsEventPartitionProcessor();
    }
}

class DavidsEventPartitionProcessor implements IRecordProcessor {

    private Logger logger = LogManager.getLogger(DavidsEventPartitionProcessor.class);

    //TODO add consumername ?

    private String partitionId;

    private ShutdownReason RE_PARTITIONING = ShutdownReason.TERMINATE;

    // The constructor name must match the class name to compile
    public DavidsEventPartitionProcessor() {
    }

    @Override
    public void initialize(InitializationInput initializationInput) {
        this.partitionId = initializationInput.getShardId();
        logger.info("Initialised partition {} for streaming.", partitionId);
    }

    @Override
    public void processRecords(ProcessRecordsInput recordsInput) {
        recordsInput.getRecords().forEach(nativeEvent -> {
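            // Decode with an explicit charset (assuming UTF-8 payloads; needs java.nio.charset.StandardCharsets)
            String eventPayload = new String(nativeEvent.getData().array(), StandardCharsets.UTF_8);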
            logger.info("Processing an event {} : {}" , nativeEvent.getSequenceNumber(), eventPayload);

            //update offset after configured amount of retries
            try {
                recordsInput.getCheckpointer().checkpoint();
                logger.debug("Persisted the consumer offset to {} for partition {}",
                        nativeEvent.getSequenceNumber(), partitionId);
            } catch (InvalidStateException e) {
                logger.error("Cannot update consumer offset to the DynamoDB table.", e);
                e.printStackTrace();
            } catch (ShutdownException e) {
                logger.error("Consumer Shutting down", e);
                e.printStackTrace();
            }
        });
    }

    @Override
    public void shutdown(ShutdownInput shutdownReason) {
        logger.debug("Shutting down event processor for {}", partitionId);

        if(shutdownReason.getShutdownReason() == RE_PARTITIONING) {
            try {
                shutdownReason.getCheckpointer().checkpoint();
            } catch (InvalidStateException e) {
                logger.error("Cannot update consumer offset to the DynamoDB table.", e);
                e.printStackTrace();
            } catch (ShutdownException e) {
                logger.error("Consumer Shutting down", e);
                e.printStackTrace();
            }
        }
    }

}

// then start the consumer

consumerWorker.run();

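Stopping the consumer

Now, when you want to stop your consumer instance (the Worker), you don't need to deal much with each PartitionConsumer; the Worker takes care of them once you ask it to shut down.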

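  • With shutdown, the Worker asks the leaseCoordinatorThreadPool, which is responsible for renewing and taking leases, to stop, and awaits its termination.

  • requestShutdown, on the other hand, cancels the lease taker AND notifies the PartitionConsumers about the shutdown.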
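
To make the shutdown side concrete, here is a minimal plain-JDK sketch. It is not KCL code: LeaseCoordinatorSketch, its method names, and the intervals are hypothetical stand-ins, and only the ScheduledExecutorService calls are real JDK API. It schedules a "taker" and a "renewer" task at a fixed interval, and stop() does what is attributed to shutdown above: ask the pool to stop and await termination.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the lease coordinator; not a KCL class.
class LeaseCoordinatorSketch {
    // Plays the role of the pool the answer calls leaseCoordinatorThreadPool.
    private final ScheduledExecutorService leaseCoordinatorThreadPool =
            Executors.newScheduledThreadPool(2);
    final AtomicInteger takerRuns = new AtomicInteger();
    final AtomicInteger renewerRuns = new AtomicInteger();

    // Schedule the lease taker and renewer at the configured interval.
    void start(long intervalMillis) {
        leaseCoordinatorThreadPool.scheduleWithFixedDelay(
                takerRuns::incrementAndGet, 0, intervalMillis, TimeUnit.MILLISECONDS);
        leaseCoordinatorThreadPool.scheduleWithFixedDelay(
                renewerRuns::incrementAndGet, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    // Ask the pool to stop, then wait for termination.
    // Returns true if the pool terminated within the timeout.
    boolean stop(long timeoutMillis) throws InterruptedException {
        leaseCoordinatorThreadPool.shutdown();
        return leaseCoordinatorThreadPool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    boolean isStopped() {
        return leaseCoordinatorThreadPool.isTerminated();
    }
}
```

Note that nothing in stop() tells the tasks that were consuming records to finish their work first; that notification step is what requestShutdown adds.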

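More importantly, with requestShutdown, if you want to be notified in your RecordProcessor, you can also implement IShutdownNotificationAware. That way, in the race condition where your RecordProcessor is processing an event but the worker is about to shut down, you should still be able to commit your offset and then shut down.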

requestShutdown returns a ShutdownFuture, which then calls back worker.shutdown.
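
Combined with the shutdown hook from the question, a hedged sketch might look like the following. To stay self-contained it does not depend on KCL: GracefulWorker is a hypothetical one-method interface standing in for Worker, on the assumption (taken from the description above) that requestShutdown() returns a Future which completes once the record processors have been notified and the worker has shut down. The class name ShutdownHooks and the timeout are illustrative only.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the KCL Worker; only the method used here is modeled.
interface GracefulWorker {
    Future<Void> requestShutdown();
}

class ShutdownHooks {
    // Request a graceful shutdown and wait up to timeoutSeconds for it to finish.
    // Returns true if the shutdown completed in time.
    static boolean shutdownGracefully(GracefulWorker worker, long timeoutSeconds) {
        Future<Void> shutdownFuture = worker.requestShutdown();
        try {
            shutdownFuture.get(timeoutSeconds, TimeUnit.SECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        }
    }

    // Build the thread to pass to Runtime.getRuntime().addShutdownHook(...).
    static Thread newShutdownHook(GracefulWorker worker, long timeoutSeconds) {
        return new Thread(() -> shutdownGracefully(worker, timeoutSeconds),
                "worker-shutdown-hook");
    }
}
```

With a real Worker this could be registered as Runtime.getRuntime().addShutdownHook(ShutdownHooks.newShutdownHook(consumerWorker::requestShutdown, 30)), assuming the Worker method has that return type in your KCL version. Waiting on the Future inside the hook matters because the JVM exits as soon as all hooks return.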

You will have to implement the following method on your RecordProcessor to get notified on requestShutdown:

class DavidsEventPartitionProcessor implements IRecordProcessor, IShutdownNotificationAware {

    private String partitionId;

    // other IRecordProcessor methods omitted

    @Override
    public void shutdownRequested(IRecordProcessorCheckpointer checkpointer) {
        logger.debug("Shutdown requested for {}", partitionId);
    }

}

But if you lose the lease before the notification, it might not be called.
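
The shutdownRequested implementation above only logs. Since the point of the notification is to commit the offset before the lease is given up, a processor could checkpoint there. The sketch below models that idea with plain-JDK stand-ins so that it runs on its own: Checkpointer and NotifiedProcessorSketch are hypothetical names, not KCL types. In real KCL code the checkpointer would be the IRecordProcessorCheckpointer passed to shutdownRequested, and the exceptions to handle would be the ones caught in the processor shown earlier.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for KCL's IRecordProcessorCheckpointer.
interface Checkpointer {
    void checkpoint(String sequenceNumber) throws Exception;
}

// Hypothetical processor: remembers the last record it finished and
// commits that position when a shutdown is requested.
class NotifiedProcessorSketch {
    private final AtomicReference<String> lastProcessed = new AtomicReference<>();

    // Analogous to processRecords: handle each record, then remember its position.
    void processRecords(List<String> sequenceNumbers) {
        for (String sequenceNumber : sequenceNumbers) {
            // ... handle the record here ...
            lastProcessed.set(sequenceNumber);
        }
    }

    // Analogous to shutdownRequested: commit the last finished position, if any.
    // Returns true if a checkpoint was written.
    boolean shutdownRequested(Checkpointer checkpointer) {
        String sequenceNumber = lastProcessed.get();
        if (sequenceNumber == null) {
            return false; // nothing processed yet, nothing to commit
        }
        try {
            checkpointer.checkpoint(sequenceNumber);
            return true;
        } catch (Exception e) {
            // In KCL, InvalidStateException / ShutdownException would be handled here.
            return false;
        }
    }
}
```

Tracking the last finished record, rather than checkpointing after every record as the earlier example does, also reduces the number of writes to the lease table.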

Summary of your questions

  

"The new library provides a new interface that record processors need to implement. But how does it get invoked?"

  • Implement an IRecordProcessorFactory and an IRecordProcessor.
  • Then wire your RecordProcessorFactory to your Worker.
  

"Tried invoking first worker.requestShutdown() then worker.shutdown(), and it works. But is there an intended way to use it?"

You should use requestShutdown() for a graceful shutdown, which takes care of the race condition. It was introduced in kinesis-client-1.7.1.